TPC Database Benchmarks
As most readers of data warehouse related blogs will no doubt know, the Transaction Processing Performance Council (TPC) defines database benchmarks that allow comparisons to be made between different technologies.
TPC-H is the most relevant TPC benchmark for those of us focused on data warehouse platforms.
TPC-H benchmarks are executed at a given scale point ranging from 1GB up to 100TB:
TPCH1 = 1GB
TPCH10 = 10GB
TPCH30 = 30GB
TPCH100 = 100GB
TPCH300 = 300GB
TPCH1000 = 1,000GB/1TB
TPCH3000 = 3,000GB/3TB
TPCH10000 = 10,000GB/10TB
TPCH30000 = 30,000GB/30TB
TPCH100000 = 100,000GB/100TB
Data used to populate the TPC-H schema is generated using the DBGEN utility, which is provided as part of the TPC-H download.
The schema used in the TPC-H benchmark is retail-based and contains the following tables: region, nation, supplier, customer, part, partsupp, orders and lineitem.
Each table contains either a fixed number of rows or a number related to the scale factor in use.
Unsurprisingly, the 'lineitem' table contains the most rows: 6,000,000 x scale factor, e.g. 600,000,000 rows for scale factor 100 (TPCH100). Column names and data types are provided for all tables.
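The row-count scaling can be sketched in a few lines. The per-table multipliers below are taken from the TPC-H specification; note the lineitem figure is approximate, as each order has a variable number of line items:

```python
# TPC-H base table cardinalities as a function of scale factor (SF).
# nation and region are fixed size; lineitem is approximate (~6M x SF).
TPCH_ROWS_PER_SF = {
    "supplier": 10_000,
    "customer": 150_000,
    "part": 200_000,
    "partsupp": 800_000,
    "orders": 1_500_000,
    "lineitem": 6_000_000,  # approximate: 1-7 line items per order
}
FIXED_ROWS = {"nation": 25, "region": 5}

def row_count(table: str, sf: int) -> int:
    """Return the (approximate) row count for a table at a given scale factor."""
    if table in FIXED_ROWS:
        return FIXED_ROWS[table]
    return TPCH_ROWS_PER_SF[table] * sf

print(row_count("lineitem", 100))  # 600000000 rows at TPCH100
```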
If you're using an MPP database such as Teradata, Greenplum or Netezza, you'll also have to decide on your own data distribution strategy.
The queries that are executed against the populated schema are generated using the QGEN utility, which is provided as part of the TPC-H download. Minor query modifications are allowed to account for differences between DBMS products.
There are a total of 22 'SELECT' queries ('query stream') that must be executed against the populated retail schema at a given scale factor. Each query corresponds to a business question. There are also 2 refresh functions ('refresh stream') that add new sales and remove old sales from the database.
Database Load Time
The elapsed time required to generate & load the data into the database must be recorded. The time taken to execute other supporting tasks, such as table creation, index creation & statistics collection, must also be recorded.
Once the data is loaded the real fun and games can begin. The performance test consists of both single-user power and multi-user throughput tests.
The power test consists of the first refresh function, followed by the 22-query set, and lastly the second refresh function.
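That sequence is simple enough to sketch. Here's a minimal illustration, assuming an `execute` callable that runs one SQL statement against the target database; the query text and refresh functions would come from QGEN and the spec:

```python
import time

def run_power_test(execute, queries, rf1, rf2):
    """Sketch of the TPC-H power-test sequence: first refresh function,
    then the 22 queries in turn, then the second refresh function.
    `queries` is a list of (name, sql) pairs; returns timings in seconds."""
    timings = {}
    for name, sql in [("RF1", rf1), *queries, ("RF2", rf2)]:
        start = time.perf_counter()
        execute(sql)
        timings[name] = time.perf_counter() - start
    return timings
```

In a real run the 22 query timings and 2 refresh timings collected here feed straight into the Power@Size metric.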
The throughput test consists of a minimum number of concurrent runs of the 22 queries ('streams'), as determined by the scale factor:
SF1 = 2
SF10 = 3
SF30 = 4
SF100 = 5
SF300 = 6
SF1000 = 7
SF3000 = 8
SF10000 = 9
SF30000 = 10
SF100000 = 11
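The minimum-stream requirement above is just a lookup; the total number of SELECT queries executed in a minimal throughput test is then streams x 22:

```python
# Minimum concurrent query streams required by the TPC-H throughput test,
# keyed by scale factor (from the list above).
MIN_STREAMS = {
    1: 2, 10: 3, 30: 4, 100: 5, 300: 6,
    1000: 7, 3000: 8, 10000: 9, 30000: 10, 100000: 11,
}

def min_throughput_queries(sf: int) -> int:
    """Total SELECT queries executed in a minimal throughput test at a given SF."""
    return MIN_STREAMS[sf] * 22

print(min_throughput_queries(100))  # 5 streams x 22 queries = 110
```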
The throughput test is run in parallel with a single refresh stream.
The set of 22 queries is run in a sequence specified in Appendix A of the TPC-H guide and is dependent on the number of streams.
The timing of all queries is measured in seconds.
The metrics captured by the power and throughput tests are as follows:
power (TPC-H Power@Size)
throughput (TPC-H Throughput@Size)
composite query-per-hour (QphH@Size)
Detailed explanations as to how these metrics are computed are available in section 5 of the TPC-H guide. There are some lovely equations in there for maths geeks to enjoy!
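For the curious, the headline metrics reduce to a few lines of arithmetic. This is a simplified sketch of the section 5 formulas (it ignores the spec's rounding rules and its limits on extreme query timings): Power@Size is 3600 x SF over the geometric mean of the 24 power-test timings, Throughput@Size is queries-per-hour scaled by SF, and QphH@Size is the geometric mean of the two:

```python
from math import prod, sqrt

def power_at_size(query_secs, refresh_secs, sf):
    """TPC-H Power@Size: 3600 * SF divided by the geometric mean of the
    22 power-test query timings and the 2 refresh-function timings."""
    times = list(query_secs) + list(refresh_secs)   # 24 timings expected
    geo_mean = prod(times) ** (1.0 / len(times))
    return 3600.0 * sf / geo_mean

def throughput_at_size(streams, elapsed_secs, sf):
    """TPC-H Throughput@Size: total queries per hour, scaled by SF."""
    return (streams * 22 * 3600.0 / elapsed_secs) * sf

def qphh_at_size(power, throughput):
    """Composite QphH@Size: geometric mean of the two component metrics."""
    return sqrt(power * throughput)
```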
Vendors that submit results are required to provide an executive summary in addition to a full disclosure report. All of the shell scripts, SQL, log files etc. are also provided as supporting files. There are no secrets here!
Benchmark performance results have been submitted at a range of the scale factors.
Perhaps interestingly, the fastest system at all scale factors is currently the Exasol database running on Dell PowerEdge servers. We'll let you peruse the TPC web site to see how they did it.
The TPC-H benchmark is primarily of interest to technology vendors to show us what they've got in their locker.
Here at VLDB we're data consultants, not tech vendors, so why the TPC-H interest?
Well, it is possible to use a cut-down version of the TPC-H benchmark to assess the real-world capability of various database platforms.
Why believe the hype when you can test database performance with a consistent set of queries against a real-world(ish) schema and usable data?
Well, that's exactly what we've been doing for several years now. Specifically, we use a set of benchmark tests to assess cloud database performance. As cloud platforms are said to suffer from the 'noisy neighbour' problem, we also run benchmark tests over extended periods to measure the variance of database performance over time.
Some of the results are interesting...very interesting :-)