Why Teradata
Caveat emptor
Beware the unreasonable claims of some data warehouse vendors.
by Dan Higgins
Ever hear advertisements like these?
Drink Hyper-Trim and you will lose 10 pounds in one week without exercising! … With the new Wing from Golf Master you will hit the ball straighter and 50 yards farther! … Make $5,000 a week from the comfort of your own home—just use the plan from my new book!
It is nearly impossible to avoid sales and marketing claims that grossly overstate the value of a product or service. They imply that an extreme case is the norm. Ultimately, this strategy risks having the opposite effect on prospective customers: Because most people realize these advertisements are only borderline true, they simply dismiss them.
My dad can beat your dad
Over the past few years, a number of data warehousing technology vendors have made similar outlandish claims, exaggerations and misleading statements. They tell prospective and current customers:
Our technology is 10 to 50 times as fast as other technologies! … Our products are half the cost of our competitors’! … Our platform set a world record in transactions per second! … In a recent customer test, our product was six times faster than the one currently in use!
It has become a game of one-upmanship as companies attempt to top one another with hyperbolic marketing. First, the system was 10 times as fast. Then 50. Then 100, and then 300.
Just watch: In a couple of years some vendor will be announcing technology that will enable the total population of a major country to have real-time analytics access to hundreds of exabytes of data. And you won’t need any IT staff, because it will all be handled in the cloud where the proverbial “miracle” occurs.
Of course, not all advertising claims are outlandish, nor are they all false. Some might even state a bit of truth. It is possible for a query to run 10, 50 or even 100 times as fast on one system as it does on another. In fact, it is not unheard-of for queries to run at these faster speeds on the same system. All it may require is a simple change to a line of code or a slightly different optimizer plan.
Furthermore, it should come as no surprise that a lightly loaded laboratory system with a few users performs much faster than a heavily loaded production system with hundreds of users. Often, the performance tests done on the lightly loaded system measure only a single query at a time on a system optimized for that particular scenario.
Misleading comparisons are especially likely when you consider that the technology in a production system is most likely many years older than the brand-new technology used in the laboratory tests. For instance, after a new Teradata software version is released, performance improvements with certain queries are not uncommon, even on an older system. Implementing Teradata 13 yields a 30% overall performance improvement, with some workloads performing as much as 20 times faster. The point is, performance comparisons are nearly always misleading because the user would see improvements on any system, regardless of the age of the technology.
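To see why a headline figure for one query says little about a whole workload, consider a minimal sketch in Python. The ten-query mix and its timings are hypothetical, chosen only for illustration. They are not Teradata measurements and do not reproduce the figures quoted above.

```python
def overall_speedup(workload):
    """Return the whole-workload speedup.

    workload: list of (baseline_seconds, speedup_factor) pairs, one per query.
    """
    before = sum(seconds for seconds, _ in workload)
    after = sum(seconds / factor for seconds, factor in workload)
    return before / after


# Hypothetical mix: one query runs 20 times faster, nine are unchanged.
workload = [(60.0, 20.0)] + [(60.0, 1.0)] * 9

headline = max(factor for _, factor in workload)
total = overall_speedup(workload)

print(f"headline speedup: {headline:.0f}x")
print(f"whole-workload speedup: {total:.2f}x")
```

In this made-up mix, the advertised number describes one query in ten, while the workload as a whole improves by roughly 10%. Both statements are true, which is why the single figure alone is not enough.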
Take my word for it
Misleading statements are not limited to performance or cost. Vendors also commonly bend industry terminology in an effort to manufacture a perception about their technology. Too often, “scalability” is claimed when it applies only to storing more data, rather than supporting more concurrent applications or users, providing more extensive data models, etc.
A few years ago, a vendor touted linear scalability because queries that accessed only one month’s data would not run longer when additional months of data were added. In reality, the data was partitioned, so each query read only the partition it needed and would not run longer anyway. This is not linear scalability as Teradata customers have come to understand it.
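The difference can be sketched with a toy cost model in Python. Everything here is an assumption made for illustration: the row counts, the scan rate, the idea that query time is simply rows scanned divided by total scan capacity, and the working definition of linear scalability (doubling both data and nodes leaves response time unchanged). It is not a description of any vendor's system.

```python
ROWS_PER_MONTH = 1_000_000      # hypothetical data volume per month
SCAN_RATE_PER_NODE = 500_000    # hypothetical rows scanned per second per node


def rows_scanned(months_stored, months_queried, partitioned):
    """With partitioning, only the queried months are read;
    without it, the whole table is scanned."""
    months_read = months_queried if partitioned else months_stored
    return ROWS_PER_MONTH * months_read


def query_seconds(months_stored, months_queried, partitioned, nodes):
    """Idealized response time: rows scanned divided by total scan rate."""
    scanned = rows_scanned(months_stored, months_queried, partitioned)
    return scanned / (SCAN_RATE_PER_NODE * nodes)


# One-month query on partitioned data: adding months changes nothing.
one_month_12 = query_seconds(12, 1, partitioned=True, nodes=1)
one_month_24 = query_seconds(24, 1, partitioned=True, nodes=1)

# Full-history query: the real test of scaling.
full_12_one_node = query_seconds(12, 12, partitioned=True, nodes=1)
full_24_one_node = query_seconds(24, 24, partitioned=True, nodes=1)
full_24_two_nodes = query_seconds(24, 24, partitioned=True, nodes=2)

print(one_month_12, one_month_24)
print(full_12_one_node, full_24_one_node, full_24_two_nodes)
```

In this model the one-month query takes the same time no matter how much history is stored, so it reveals nothing about scaling. Only the query that touches all of the data shows whether adding hardware keeps pace with adding data.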
Similarly, some vendors claim they can do “enterprise data warehousing” because their system promotes leveraging multiple platforms throughout the organization. But managing multiple platforms defeats the purpose of having an enterprise data warehouse.
And while simplicity can certainly be a system attribute, some vendors use the term to disguise woefully lacking functionality. Once this “simple” system is installed, however, customers typically end up placing a considerable burden on IT and end users to either develop that functionality in-house or suffer without it.
Distinguish apples from oranges
Technologies are not all the same. Most products on the market differ substantially, with strengths and weaknesses that depend on the job. In a benchmark, a given technology might excel at one type of query or workload and struggle with another.
Because most vendors don’t provide the full story in their marketing claims, customers can become victims if they don’t do their own research. Prospective customers need to know the answers to questions such as:
- What is the truth and what are the details surrounding a specific claim?
- How are vendors using industry terminology? To gain the advantage, do they beef up the meaning behind words and phrases like “scalability,” “simultaneous versus concurrent users,” “user data space,” “simplicity,” “ad hoc” and so on?
- On what basis are different platforms compared? What are the details?
- What are the system configurations and product generations and releases?
- What are the tests? How are they run? How are they measured?
- What are all of the results, not just the ones that look impressive?
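As an illustration of the last question, here is a small Python sketch that summarizes every result from a comparison rather than only the best one. The five query names and their ratios are hypothetical, and the choice of summary statistics (best, worst, median, geometric mean) is one reasonable option, not an industry standard.

```python
from statistics import geometric_mean, median


def summarize(ratios):
    """Summarize a full set of benchmark results.

    ratios: dict mapping query name to old_time / new_time,
            so a value above 1.0 means the new system was faster.
    """
    values = list(ratios.values())
    return {
        "best": max(values),
        "worst": min(values),
        "median": median(values),
        "geomean": geometric_mean(values),
        "slower": sorted(q for q, r in ratios.items() if r < 1.0),
    }


# Hypothetical test: one query is six times faster, two are slower.
results = {"q1": 6.0, "q2": 1.2, "q3": 0.5, "q4": 1.0, "q5": 0.8}
summary = summarize(results)

for name, value in summary.items():
    print(f"{name}: {value}")
```

A claim of "six times faster" would be accurate for q1 in this invented data set, yet the median query did not improve at all and two queries got slower. Asking for the full table makes that visible.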
Ultimately, it falls to the customers to dig into the facts before establishing a technology strategy or making an investment that might prove unwise. It is far too difficult to develop industry-standard measures and comparisons that can approximate even the simplest data warehousing workloads.
The Transaction Processing Performance Council, a consortium of vendors of which Teradata is a member, has attempted to establish these standards, but for the sake of repeatability and to control costs, the tests have been simplified to the point of irrelevance.
Uncovering the truth is not easy, and it is far more difficult with new, unproven products.
What I meant was …
Why is it necessary to thoroughly investigate any vendor’s technology claim? Because the cost of an ill-chosen product can far outweigh its purchase price or total cost of ownership.
The old adage “too good to be true” is often correct. Many people have purchased golf clubs or dietary supplements, or have been hooked by get-rich-quick schemes, only to be disappointed.
Likewise, based on vendors’ claims and promises, many companies invest in data warehousing technology only to find they must spend significant time and money—often years and millions of dollars—trying to make the technology work. And after suffering through one unwise decision, many organizations are gun-shy about investing in what would be a far wiser choice. Instead, they continue to struggle with their original technology, or they abandon their vision for a successful data warehouse and associated business solutions.
Thorough evaluation of alternative technologies can minimize or eliminate the high risk and considerable cost of an ill-informed decision. Even better, by asking probing questions and applying that knowledge to new product purchases, organizations greatly increase their likelihood of success in data warehousing.
So don’t just take every claim you hear as truth. Dig deep, ask questions and choose wisely.
Dan Higgins has been in the IT industry for 35 years with experience in IT systems architecture design, technology evaluation, and data warehouse design and implementation. He leads a Teradata sales support team of data warehousing subject matter experts.