You Need a Simple Answer to a Complex Problem
Analysts are challenged to capture and interpret data that is spread across various systems.
Big data has changed the analytics game. The range of data management and analytics engines continues to expand rapidly. As a result, businesses are forced to use multiple solutions to get the answers they need. Engines offer myriad features at various price points to keep up with organizations integrating big data analytics into their daily operations. Each engine is a natural home for particular types of data and analytics. But using multiple engines means moving and transforming data, often with IT help, before businesses can gain insights and value that were previously out of reach.
What organizations need is a simple way to leverage all of their analytics resources. To that end, understanding the capabilities each engine offers is crucial to maximizing its potential. New technologies are also needed to harness the capabilities of individual engines while cutting through the complexity associated with data movement and heterogeneous interfaces.
So Many Choices
Analytics techniques and options have always been diverse, and today is no different. With so many choices and capabilities available, no single platform can do everything on its own. Businesses need multiple platforms, and they have to understand how to exploit them easily.
The downside to so many options is the inevitable complexity that arises. Data is stored in various analytics systems, with each one handling different types of processing and information.
In addition, many new big data technologies have primitive interfaces and languages that limit end-user adoption, and the need to integrate results from different engines can bog down operations. The result can be a disjointed, uncoordinated environment that does not deliver its full business capabilities.
Understanding the capabilities each engine offers is crucial to ensuring that organizations derive the full value from their analytics. That is why they need to consider these factors when evaluating alternatives:
Database Types
A wide range of database options is available for performing analytics, including graph, columnar and massively parallel processing (MPP) relational database management systems (RDBMSs). Hybrid models are also available that combine database categories in a variety of ways. For example, some vendors have incorporated columnar and graph technology into MPP databases.
Memory and Disk Types
Engines can be configured with different storage types to achieve faster performance or to minimize costs. They can be configured to have varying amounts of RAM or be designed to process all data in-memory. Spinning disk options are available for high or low I/O and CPU, while more expensive solid-state drives offer very fast data retrieval.
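As a rough illustration of the speed-versus-cost trade-off described above, the sketch below assigns a data set to a storage type based on how often it is read. The function name and both thresholds are hypothetical placeholders, not values drawn from any product; a real decision would also weigh data volume and budget.

```python
# Hypothetical thresholds for illustration only; tune them to the workload.
HOT_READS_PER_DAY = 1000
WARM_READS_PER_DAY = 10


def choose_storage_tier(reads_per_day: int) -> str:
    """Pick a storage type for a data set based on how often it is read."""
    if reads_per_day >= HOT_READS_PER_DAY:
        return "memory"         # fastest retrieval, highest cost
    if reads_per_day >= WARM_READS_PER_DAY:
        return "ssd"            # fast retrieval, costlier than spinning disk
    return "spinning_disk"      # lowest cost for rarely read data
```

For example, `choose_storage_tier(5)` places a rarely read archive on spinning disk, while a data set read thousands of times a day would be held in memory.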
Design Patterns
Analytics engines can support various design patterns, including a data warehouse, data lake or discovery platform. Technologies and design patterns were once closely tied to one another, but that is no longer the case. Instead, vendors have incorporated essential features in their technologies that deliver many key aspects of the design patterns.
Economics
The revenue potential of analytics capabilities must be evaluated against the costs, including capital expenditures and operational expenses related to development, usage, maintenance, support and data center resources. The wisest course is to pursue value and let the economics of the data and platforms dictate the analytics use.
No individual platform is perfect; businesses need more than one solution. The question is how to make exploiting multiple platforms easier.
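The evaluation of revenue potential against costs can be sketched as a simple calculation: expected revenue minus capital and operating expenses. The class and field names below are assumptions made for illustration, and any figures passed in would be an organization's own estimates.

```python
from dataclasses import dataclass


@dataclass
class PlatformEstimate:
    """Hypothetical yearly estimate for one analytics platform."""
    name: str
    expected_revenue: float   # revenue attributed to the analytics it enables
    capital_expense: float    # annualized capital expenditures
    operating_expense: float  # development, usage, maintenance, support, data center

    def net_value(self) -> float:
        """Revenue potential minus the combined capital and operating costs."""
        return self.expected_revenue - (self.capital_expense + self.operating_expense)


def rank_by_net_value(estimates):
    """Return the estimates ordered from highest to lowest net value."""
    return sorted(estimates, key=lambda e: e.net_value(), reverse=True)
```

Ranking by net value is only one possible way to let the economics drive the decision; it ignores risk and time horizon, which a fuller model would include.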
Pay Attention to the Story, Not the Location
Although a world with multiple analytics engines is inevitable, analysts should focus on the story data can tell, not where it resides. After all, paying analysts to be data movers—transferring data in and out of different engines—is not a good use of resources.
Analysts should be able to find and access the data they need simply and run whatever analytics are required. Tapping the information where it resides significantly lowers the cost and time of analysis.
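One way to picture tapping data where it resides is to run the aggregation inside each engine and bring back only the small summaries, rather than copying every row to a central system. The sketch below uses in-memory SQLite databases as stand-ins for separate engines. The `sales` table, its columns and the function names are hypothetical, and real engines would each have their own connectors and query dialects.

```python
import sqlite3


def make_engine(rows):
    """Create an in-memory SQLite database standing in for one analytics engine."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def total_by_region(engines):
    """Aggregate inside each engine, then merge only the per-region subtotals."""
    totals = {}
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    for conn in engines:
        for region, subtotal in conn.execute(query):
            totals[region] = totals.get(region, 0.0) + subtotal
    return totals
```

Only one row per region leaves each engine, so the analyst combines results without acting as a data mover for the underlying detail.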
Organizations are faced with a lot of choices for engines. When selecting them, it’s important to base the decision on design, business needs and budget.
Dan Woods is CTO and founder of CITO Research. He has written and co-authored more than 20 books about business and technology and has a column on Forbes.com.