
TPC Benchmark Status
March, 1999

By Kim Shanley, TPC Administrator

TPC Benchmark Status is published about every two months. The first and primary purpose of the newsletter is to keep interested parties informed about the content, issues, and schedule of the TPC's benchmark development efforts. The second purpose is to invite new members to join these important development efforts. We've already outlined most of the reasons for joining the TPC in another article, Why Join. To receive the status report by email, please click here.

Quick Summary
The TPC held a General Council meeting February 11-12 in Houston. There wasn’t much news to report from the TPC-C and TPC-W Subcommittees at the meeting, so I will spend my energies reporting on some rather momentous TPC-D decisions. I ask the reader for a little patience as I try to sort through the complex and fluid TPC-D situation for you.

Background to Major Changes to TPC-D
TPC-D was developed to represent an ad hoc business environment where users submitted more or less random queries against a data warehouse. The queries are designed to be complex and require significant processing. At the beginning of TPC-D's existence, all this was true. This has changed over the course of the last year, and particularly the last few months as Version 1.X was closed. The execution times on several TPC-D queries have plunged nearly to zero. Why?

In the course of the last year, various database vendors have released new technology which enables the building of structures (similar to indexes) that contain precomputed query aggregates. Without any changes to query text, queries can access these structures transparently at run time and quickly calculate the value of needed aggregate expressions. This technique moves much of the work that had previously been part of TPC-D query execution into the database load phase. Everyone who has discussed this at the Council agrees that this technology is valuable to end-users. For certain types of business environments, this technology does improve query performance manyfold. It was also generally agreed that TPC-D helped bring these new technologies forward, and that while they are causing the TPC to consider redefining its benchmark models, end-users are benefiting from these technology advances.

The aggregate technology is very useful when users (typically very knowledgeable users like database administrators) know the queries and the domain well in advance, can create auxiliary structures like aggregated columns, and can optimize their databases to run these queries. Many have called this type of environment a "business reporting" environment. The problem is that TPC-D was intended to represent an ad hoc environment in which queries are submitted on a random basis and are not known in advance.
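
To make the distinction concrete, here is a minimal Python sketch of the idea, not any vendor's implementation. The table contents and the names `ROWS`, `build_aggregate`, `total_by_aggregate`, and `total_above_by_scan` are all hypothetical. It shows a sum per region being precomputed once at load time, so that a query known in advance becomes a lookup, while a query that was not anticipated still has to scan the rows.

```python
from collections import defaultdict

# Hypothetical fact table of (region, amount) rows; illustrative data only.
ROWS = [("east", 10.0), ("west", 5.0), ("east", 2.5), ("north", 4.0), ("west", 1.0)]


def total_by_scan(rows, region):
    """Answer the query at run time by scanning every row."""
    return sum(amount for r, amount in rows if r == region)


def build_aggregate(rows):
    """Load-phase work: precompute the sum of amounts per region once."""
    totals = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


def total_by_aggregate(aggregate, region):
    """Run-time work for a query known in advance: a single lookup."""
    return aggregate.get(region, 0.0)


def total_above_by_scan(rows, region, threshold):
    """An unanticipated query: the per-region sums cannot answer it,
    so the rows must be scanned again."""
    return sum(a for r, a in rows if r == region and a > threshold)


# Built once during the load phase, reused by every matching query.
AGGREGATE = build_aggregate(ROWS)
```

Both `total_by_scan(ROWS, "east")` and `total_by_aggregate(AGGREGATE, "east")` return the same total, but the second does almost no work at query time. A filter on `amount`, as in `total_above_by_scan`, was not planned for when the aggregate was built, which is the ad hoc case TPC-D was originally meant to model.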

The TPC’s February Decision
At the February General Council meeting, the Council decided that TPC-D was one benchmark trying to represent two very different business environments. The first is the environment in which users know the queries very well and can optimize their DBMS to execute these queries very rapidly (business reporting environment). The other environment is the original ad hoc environment in which users don’t know the queries in advance and the execution times can be very long. The Council voted to issue a mail ballot to the general membership asking it to divide the TPC-D benchmark into the following two benchmarks:

  • TPC-R (Business Reporting Benchmark). In light of the application of the new DBMS technology to the TPC-D workload and rules, the TPC has modified TPC-D Version 2.0 to create Version 2.1. TPC-D Version 2.1 eliminates the references to TPC-D’s ad hoc business model. The language now reflects the fact that TPC-D, as it currently stands, is more representative of the business reporting environment. If the TPC-R mail ballot passes, then Version 2.1 of TPC-D will become TPC-R Version 1.0 and TPC-D Version 2.1 will disappear.
  • TPC-H (Ad Hoc Benchmark). The TPC-D Subcommittee created a new ad hoc benchmark that restores the "ad hocness" of the original benchmark workload. The Subcommittee used TPC-D Version 2.0 as the baseline specification but then added language that restricts the use of auxiliary structures (such as indexes and aggregates) as well as regulating the horizontal partitioning of tables.
This mail ballot to the general membership will be issued in late February and by late April 1999, we should have the results. If the mail ballot is approved by two-thirds of the members, TPC-D will disappear and we’ll have TPC-R and TPC-H in its place. The strong majority of TPC members who voted to issue this mail ballot feel that by dividing TPC-D into these two different benchmarks, two goals would be accomplished:

  1. Two benchmarks (TPC-R and TPC-H) would better represent two very different decision support business environments.
  2. Performance specialists working in the TPC-R and TPC-H areas could move more quickly to enhance the TPC-R and TPC-H specifications, as the people working in each area could focus their energies on a single target.
What’s the Status of TPC-D Now?
  • TPC-D Version 2.0.1 (which retains the original ad hoc workload description) remains a valid TPC-D benchmark. Anyone can publish a result on this specification.
  • TPC-D Version 2.1 (described above) will become mandatory on April 11, 1999.
What’s the Status of TPC-D Results Now?
There are over 100 TPC-D Version 1.X results on the TPC results list and these remain valid until mid-August, 1999 unless withdrawn. According to our rules, all Version 1.X results can be compared with one another fairly. But you might ask, "What about the application of these new DBMS technologies over the last several months? How do you compare these latest results with earlier Version 1.X results?" There are three points to be made here:

  1. Everyone runs the test under the same rules within a major version of the benchmark, so all results can be compared fairly;
  2. Do "some" of the latest Version 1.X results show the gains from these latest DBMS technologies? Yes, they do. Having said that, all companies could apply these technologies under the Version 1.X rules, so again, the benchmark playing field is level.
  3. TPC-D Version 2.X has more queries than Version 1.X and other significant differences, so Version 1.X and Version 2.X results should not be compared. With the change to the wording in Version 2.1, TPC-D can now be thought of as a business reporting decision support benchmark.
Bottom Line?
Yes, there’s no getting around it. The creation of new TPC benchmark definitions and the shifting of TPC business models are confusing. But we shouldn’t lose sight of what’s truly important. Whatever the state of our benchmark world, the world of end-users is improving. The rapid advances in DBMS technology are providing a substantial real-world performance boost to end-users. To the degree that TPC provided the incentive to make that happen, we’ve accomplished our major mission.
