Introducing TPC-D Version 2

1. Introduction

The TPC-D Subcommittee is preparing a new version of its Decision Support benchmark, Version 2. This overview summarizes the goals and content of the TPC-D Version 2 specification, and highlights differences from the existing Version 1. Version 2 will increase the relevance of the TPC-D benchmark to today's decision support environments and will provide a solid foundation for future revisions. At the time of this writing, the Version 2 specification is being readied for TPC Company Review.

2. Background

Version 1 of TPC-D was approved in April 1995. The first results were published in December 1995, and the number of results has increased steadily since then. To date, 39 results have been published by 6 vendors. The benchmark has been through three minor revisions that addressed questions and issues raised in the course of its use. In each case, the changes maintained comparability with existing results. The currently active revision is Version 1.3.1. More information on TPC-D can be found at <www.tpc.org>.

3. Why Version 2?

In the interest of retaining market relevance, the TPC-D subcommittee began exploring the components of a major benchmark revision in mid-1997. Overall, the goal of such a revision was to keep TPC-D up to date and to reflect accurately the increasingly complex nature of decision support workloads. In particular, the group considered a more complex data warehouse schema involving multiple subject areas, the addition of a "star" or dimensional element to the schema, more realistic data population including skewed data and possibly abstract data types, and a larger and more varied query set. The group also discussed more realistic execution models, including mandatory multi-user or multi-stream execution, which Version 1 does not require.

With the magnitude of enhancements being considered, it soon became clear that the best plan for revision would be to split the improvements into two phases. Version 2 includes those changes that could be implemented relatively quickly, while the more complex improvements (including data skew) will be tackled in Version 3. The Version 2 improvements include:

Requiring multiple simultaneous query streams, as the vast majority of real decision support systems support growing user populations;
A simpler single reported performance metric, making it easier to compare results from different vendors;
Increasing the number of queries from 17 to 22, to add additional query functionality and complexity to the workload;
Reducing the number of query variants from which test sponsors are allowed to choose, which increases the comparability of results;
More realistic requirements with respect to database durability;
More efficient parallel generation of data (using the DBGEN program) for database load.

Because Version 2 introduces changes in execution rules and additional queries, Version 2 results are not comparable with Version 1 results. Section 5 describes the content of Version 2 in more detail.

4. When will Version 2 be effective?

The current schedule foresees the availability of TPC-D Version 2 in early November 1998. Version 1 results will no longer be published after early January 1999. Under TPC rules, all Version 1 results will be removed from the TPC-D results list in early June 1999. Version 2 is expected to be the most current revision for at least two years.

5. What's new in Version 2 of TPC-D?

Version 2 addresses many of the changes discussed by the committee. It introduces a number of changes in execution rules designed to more closely simulate actual environments and limit the number of potentially non-comparable implementation options. The following subsections summarize changes implemented in Version 2.

5.1 Mandatory Multistream Execution in the Throughput Test

Version 1 requires a single-stream Power test. The multi-stream Throughput test is optional, and no minimum number of concurrent streams is specified. Test sponsors may choose to use single stream query timings to compute the "Throughput" (QthD@Size) metric instead of running a multistream Throughput test, and many have done so. Version 2 requires the execution of a multistream Throughput test. There is a minimum number of query streams specified for each allowed Scale Factor (test database size). The minimum stream count increases slowly as a function of the database size. For example, a 1 GB TPC-D result will require at least 2 streams in the Throughput test, a 100 GB result 5 streams, and a 1 TB result 7 streams.
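
The minimum stream rule can be illustrated with a small lookup. This is only a sketch: it encodes the three example values quoted above and nothing else, the function and variable names are hypothetical, and it assumes that a 1 TB database corresponds to a scale factor of 1000 GB. The specification itself remains the authority for the complete table.

```python
# Minimum Throughput-test stream counts quoted in the text above.
# Only these three scale factors are given there; all others are
# deliberately left out rather than guessed.
# Assumption: 1 TB is expressed as a scale factor of 1000 GB.
MIN_STREAMS_BY_SCALE_FACTOR_GB = {1: 2, 100: 5, 1000: 7}


def meets_minimum_streams(scale_factor_gb: int, streams: int) -> bool:
    """Return True if `streams` satisfies the quoted minimum (hypothetical helper)."""
    if scale_factor_gb not in MIN_STREAMS_BY_SCALE_FACTOR_GB:
        raise ValueError(f"no minimum quoted for {scale_factor_gb} GB")
    return streams >= MIN_STREAMS_BY_SCALE_FACTOR_GB[scale_factor_gb]
```

For instance, a 100 GB run with 4 query streams would fall short of the quoted minimum, while the same run with 5 streams would satisfy it.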

5.2 Enhanced Query Set

Version 1 specifies 17 SQL queries and two so-called update functions. Version 2 has added 6 new queries to the existing set, and deleted 1 (Version 1 Query 13, which many vendors could execute in a few seconds) to bring the total to 22. The update functions have not changed, except that they have been renamed to "refresh functions." While the 6 new queries were formally proposed by several vendors, most of them were derived from end-user suggestions or customized uses of the Version 1 benchmark. The 6 queries introduce some new functionality into the set. They include:

A left outer join query
A query with challenging nested subqueries
An aggregate with a HAVING clause
An interesting OR query with many predicates
A query combining EXISTS and NOT EXISTS
A query with multiple instances of the SUBSTRING function
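
To give a feel for some of these SQL features, the sketch below runs three toy queries with Python's built-in sqlite3 module. These are not the TPC-D queries: the two-table schema, the column names, and the data are hypothetical and chosen only for illustration, and SQLite's substr function stands in for SUBSTRING.

```python
import sqlite3

# Hypothetical toy schema; it is not the TPC-D schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (custkey INTEGER PRIMARY KEY, phone TEXT);
    CREATE TABLE orders (orderkey INTEGER PRIMARY KEY, custkey INTEGER, total REAL);
    INSERT INTO customer VALUES (1, '13-555-0100'), (2, '31-555-0101'), (3, '13-555-0102');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 70.0), (12, 2, 20.0);
""")

# Left outer join: customers without orders still appear, with a count of 0.
orders_per_customer = conn.execute("""
    SELECT c.custkey, COUNT(o.orderkey)
    FROM customer c LEFT OUTER JOIN orders o ON o.custkey = c.custkey
    GROUP BY c.custkey
    ORDER BY c.custkey
""").fetchall()

# Aggregate with a HAVING clause: keep only groups whose sum exceeds 100.
big_spenders = conn.execute("""
    SELECT custkey, SUM(total)
    FROM orders
    GROUP BY custkey
    HAVING SUM(total) > 100
    ORDER BY custkey
""").fetchall()

# NOT EXISTS combined with a substring predicate: customers whose phone
# starts with '13' and who have placed no orders.
idle_customers = conn.execute("""
    SELECT c.custkey
    FROM customer c
    WHERE substr(c.phone, 1, 2) = '13'
      AND NOT EXISTS (SELECT 1 FROM orders o WHERE o.custkey = c.custkey)
    ORDER BY c.custkey
""").fetchall()

conn.close()
```

With this data, customer 3 has no orders, so it appears in the outer join with a count of 0 and is the only row returned by the NOT EXISTS query.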

As in Version 1, queries must be submitted exactly as specified, apart from minor modifications that are permitted in certain circumstances. The large pool of query variants (more substantial departures from the "base" queries in the specification, which require special approval) has been reduced from 32 in Version 1 to only 5 in Version 2. Of these 5 variants, 3 accommodate vendor-specific syntax that maps almost exactly to the SQL CASE expression. As in Version 1, test sponsors must demonstrate correct answers to the queries run against a small "qualification" database; however, the size of the qualification database has been increased from 100 MB to 1 GB.

5.3 Simpler Performance Metric

Version 1 requires disclosure of two performance metrics. The Power (QppD) metric is based on the single-stream Power test, while the Throughput (QthD) metric may be based on an actual multi-stream Throughput run, or on recalculated single-stream results. Version 2 combines the Power and Throughput metrics into a single new performance metric called Composite Queries-per-Hour (QphD). This quantity is the same as that used in the numerator of the Version 1 price-performance metric. It combines single- and multi-user performance in a balanced way using the already familiar elements of Power and Throughput. A single performance metric will be easier to use for benchmark marketing and comparison purposes.
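
As a rough illustration, the composite can be computed as the geometric mean of the two existing metrics. The text above does not spell out the formula, so this sketch assumes the geometric-mean definition used in the numerator of the Version 1 price-performance metric; the function name is hypothetical and the figures in the usage note are made up.

```python
import math


def composite_qphd(power_qppd: float, throughput_qthd: float) -> float:
    """Geometric mean of Power and Throughput (assumed definition of QphD)."""
    if power_qppd < 0 or throughput_qthd < 0:
        raise ValueError("metrics must be non-negative")
    return math.sqrt(power_qppd * throughput_qthd)
```

With made-up figures of 400 for Power and 100 for Throughput, the composite would be 200. A geometric mean rewards balance: a system cannot compensate for a very low Throughput figure with a very high Power figure as easily as it could under an arithmetic mean.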

5.4 More Realistic Database Durability Requirement

TPC-D Version 1 requires a number of "ACID" (Atomicity, Consistency, Isolation, Durability) properties of the database used to implement the benchmark. Version 2 makes some changes to the test system's durability requirements. Durability is the ability of the database to withstand and recover from single-point failures of durable media. Two common ways of assuring durability are:

Use of RAID (Redundant Arrays of Inexpensive Disks) storage devices
Recovery from media failures using database backups and rollforward recovery from a log

If a test sponsor relies on the second option, Version 1 permits a demonstration of backup/rollforward functionality on the small qualification database, but does not require a demonstration using the larger test database. In particular, vendors are not required to make or have database backups of the test database, even if its durability depends on them. Since availability of decision support databases is becoming increasingly important, the subcommittee decided to require the durability of the actual test database in Version 2. If RAID is not used, and database durability depends on a backup, then the backup must be performed and timed as part of the database load test. Furthermore, the cost of backup storage and media must be included in the priced benchmark configuration.

5.5 Stricter Implementation Options

Two minor changes were made to implementation rules in Version 2. They affect the refresh (update) functions and the implementation of fixed-length text columns. These changes reduce the allowable variations in benchmark implementations and thus enhance comparability of benchmark results.

Between benchmark runs, Version 1 allows test sponsors to back out or reset the effects of the refresh functions, which insert rows into and delete rows from the test database. Alternatively, sponsors can choose the so-called "evolve" option, in which the effects of the refresh functions are not undone between runs. Version 2 removes the option to undo the effects of the refresh functions between runs. The intent is to represent more accurately the steady-state effect of data maintenance (and disorganization) in an actual decision support system.

Version 1 defines "fixed" and "variable" text types. It allows test sponsors to implement a fixed-length text field using a VARCHAR data type, as long as the text is blank-padded for display. Because of the potential advantage of this approach, Version 2 imposes stricter requirements on the use of VARCHAR data types to implement fixed-length text fields.

5.6 Data Population Improvements

TPC-D Version 1 supplied the database generation program DBGEN, written in C, which has been successfully ported to a wide variety of platforms. It enables serial or parallel generation of flat files for database loading, and can be modified to achieve inline loading. For Version 2, DBGEN has been modified in two significant ways. First, as in Version 1, the data from which the database is loaded can be generated in multiple steps, potentially in parallel; unlike Version 1, however, parallel data file generation no longer requires the time-consuming building of "seed files" used for random number synchronization. Second, the Version 2 DBGEN program generates more realistic text to populate the "Comment" field found in most tables. The new Comment field population offers the chance for interesting text-searching queries to be added to the benchmark at a later date. TPC-D Version 2 must be run against a database generated by DBGEN Version 2 to obtain correct query results.
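
One general technique for synchronizing random numbers across parallel generators without precomputed seed files is to jump directly to the n-th state of the generator. The sketch below illustrates that idea for a multiplicative congruential generator. It is not a description of how DBGEN works internally: the function names are hypothetical, and the constants are the widely known Park-Miller "minimal standard" values, used here purely for illustration.

```python
# Illustrative generator constants (Park-Miller "minimal standard").
# Assumption: chosen for this sketch only, not taken from DBGEN.
MULTIPLIER = 16807
MODULUS = 2**31 - 1


def next_value(state: int) -> int:
    """Advance the generator by one step."""
    return (MULTIPLIER * state) % MODULUS


def skip_ahead(state: int, steps: int) -> int:
    """Advance the generator by `steps` steps in O(log steps) time.

    For a multiplicative generator, the state after n steps equals
    MULTIPLIER**n * state mod MODULUS, and three-argument pow() computes
    the modular power by repeated squaring.
    """
    return (pow(MULTIPLIER, steps, MODULUS) * state) % MODULUS
```

A worker responsible for rows starting at some offset can call skip_ahead with the number of random values consumed before that offset, then generate its slice independently and still produce the same values a serial run would.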

The Version 1 schema included the so-called TIME table to accommodate SQL implementations that did not support date-time data types. Since it was never used in any published result, the TIME table was removed from the schema in Version 2. This made it possible to remove all the query variants whose sole purpose was to enable use of the TIME table.

6. Conclusion

Since its introduction in 1995, TPC-D Version 1 has become a very popular and useful benchmark. Customers have used it to compare the DSS performance offered by the numerous vendors with published results, and vendors have used it to tune their hardware and software. Like all good industry-standard benchmarks, TPC-D needs ongoing modification to stay up to date with the ever-changing computing environment. Version 2 takes the first step in that direction and prepares the way for further enhancement with Version 3. The TPC-D subcommittee is confident that these changes move the benchmark in a positive direction and looks forward to feedback from the community on how to improve it even further.

[Revised May 11, 1998]