On The Database Market for Analytics. Q&A with Carl Olofson

Q1. What has impressed you most in recent months in the database market?

I guess the popular answer would be the infusion of GenAI into the database sphere, but honestly, the degree of penetration of AI into structured data is peripheral at best. Part of the reason is a woeful lack of governance. Broad-based unanticipated queries require data that is consistently defined and valued and temporally aligned. This is a big challenge with analytic data because it is culled from many different databases and other data sources. All that data must be defined according to a model that enables comparison and calculation, which means the definitions must align, and for data that is defined the same way, there can be no inconsistency in data values. Coordinating the data temporally, that is, ensuring that a query set at a given point in time always returns results from sources that are valid as of that time, is also critical. On top of all that, there must be business definitions of the data, based on a consistent data glossary of terms. There’s a lot of work involved, but I think with GenAI-assisted tools it can be accomplished much faster than was possible in the past. I am seeing, in recent user surveys, an increased awareness of the need for this level of governance, and an increased focus on the part of leading vendors in this direction.
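To make the point about temporal alignment concrete, here is a minimal sketch of an "as of" lookup across two sources. It is an illustration only, not something described in the interview: the source names (`crm_region`, `billing_price`), the sample values, and the helpers `value_as_of` and `snapshot` are all hypothetical.

```python
from bisect import bisect_right
from datetime import date


def value_as_of(history, as_of):
    """Return the value in effect on `as_of`, or None if none existed yet.

    `history` is a list of (valid_from, value) pairs sorted by valid_from.
    """
    dates = [valid_from for valid_from, _ in history]
    idx = bisect_right(dates, as_of)  # count of versions that started on or before as_of
    return history[idx - 1][1] if idx else None


# Hypothetical version histories drawn from two different source systems.
crm_region = [(date(2023, 1, 1), "EMEA"), (date(2024, 7, 1), "Europe")]
billing_price = [(date(2023, 3, 1), 100), (date(2024, 1, 1), 120)]


def snapshot(as_of):
    """Combine both sources using the same point in time for each lookup."""
    return {
        "region": value_as_of(crm_region, as_of),
        "price": value_as_of(billing_price, as_of),
    }


print(snapshot(date(2024, 3, 1)))
```

The design point is simply that every source is read with the same `as_of` date, so a result never mixes a value from one period with a value from another.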

Q2. It seems that there is an increasingly common belief that one can use one database for all manner of analytics. What is your take on this? 

This is utterly untrue. There is a wide variety of analytics, some general and some specialized, and each requires a data environment that delivers functionality that is attuned to the needs of the analytics being performed.

Q3. So, not all analytics are the same. Can you clarify this? 

In general, there are four levels of business objectives addressed by analytics. These are 1) strategic decision making: using a broad range of data from across the enterprise and over a period of years to make strategic decisions regarding the future of the enterprise, subject to constant adjustment, 2) tactical decision making: using data relevant to current business conditions, sometimes broad, sometimes narrow, in the context of historical data, to make adjustments to current business operations, 3) operational decision making: using current data to make intra-day adjustments to business operations, and 4) real-time decisions: using streaming, sensor, data feed, and CDC data to take business-critical minute, second, and subsecond level actions. In addition, there are specialized analyses involved, such as time-series analysis for detecting trends and patterns in the moment.

Q4. Does each class of analytics call for a specific DBMS that addresses its key needs?

Yes, indeed. There are discrete DBMS products aimed at addressing each of the types of analysis that I have mentioned. Some emphasize speed of data acceptance and interpretation, some stress size of database and speed of complex queries, some stress flexibility and speed of deployment. Each serves a purpose.

Q5. If this is the case, is there a danger that companies will end up having a proliferation of different databases that in the end need to be coordinated?  

Often, these types of analytics are specific to particular business functions and therefore deployed by specific departments or business units. They don’t interfere with each other, so coordination is not necessary. When these various levels of analytics need to agree with each other, then a strategic approach to technology acquisition must be taken. Also, some companies may balk at having many subscriptions to different services provided by different vendors, so business simplification may mean that some users will have to compromise regarding their choice of technology.

Q6. What are the databases that you consider most promising for analytics? 

For strategic decision making, general relational DBMSs are the best suited; those that can be deployed effectively as platforms for enterprise data warehouses. For tactical decision making, the picture gets more complicated. The data warehouse may still be involved, but as a background resource along with data collected into a fast-loading and quick-processing DBMS, and perhaps including a data lake. Data lakes are also key for building new statistical models by data scientists based on broadly collected data. For tactical operation, one might favor a so-called “in-memory” or memory-optimized DBMS, which will also serve in dealing with streaming data. For broad-based analysis across a variable range of data in the enterprise, a lakehouse may be required.

Some vendors offer some combinations of these capabilities together in one or several products; others offer a discretely designed DBMS for each of these cases.

Q7. What about GenAI and databases? Any thoughts on this? 

Today, GenAI plays a minor role in the structured data world, powering tools that aid in database design, database application development, and end-user natural language query. Once the governance issue is well and truly resolved, it can be used for semantic inferencing, enabling the GenAI system to analyze and combine data with the same flexibility that is currently possible with unstructured data. But I think that is years away.

Qx. Anything else you wish to add? 

I am encouraged by the increased attention to governance, and the corresponding growth of maturity in the market regarding databases for analytics, which is not a one-size-fits-all approach, but a “pick the best tool for the job” one. I expect products and product functions to be combined in the coming years, and considerable industry consolidation, but in the end, we may have data systems that can deliver immediate value to untrained users at all levels, enabling smarter and more efficient business operations.

……………………………………………..

Carl Olofson, Research Vice President, IDC.

Carl Olofson has performed research and analysis for IDC since 1997, and manages IDC’s Database Management Software service, as well as supporting the Data Integration Software and Data Streaming Pipelines services. Mr. Olofson’s research involves following sales and technical developments in the structured data management (SDM) software markets. One key market is the database management systems (DBMS) software market, which includes non-schematic database management systems, data lake managers, navigational database management systems, low code database management systems, and memory-optimized shared data managers. Also covered is the database administration and development software market. Mr. Olofson also contributes to Big Data research and provides specialized coverage of Hadoop and other Big Data technologies. Mr. Olofson advises clients on market and technology directions as well as performing supply and demand-side primary research to size, forecast, and segment the database and related software markets.
