On MariaDB. Q&A with David Thompson.
Q1. What are the main lessons you have learned in your career in developing and scaling out data-driven applications?
The biggest thing I’ve learned is the importance of starting small, focusing on your critical requirements first and selecting a technology that will help you scale when you’re ready. Once you’ve built something that is already adding value to stakeholders, from there you can switch your focus to scaling and already have the technology in place to do so.
Q2. You have introduced MariaDB ColumnStore 1.1. What is it, and how does it compare with other NoSQL columnar storage engines for large-scale analytics in the market?
MariaDB ColumnStore 1.1 is a storage engine optimized for modern analytical workloads – distributed and with parallel query processing for greater scalability, and columnar storage for higher efficiency and query performance. This serves as the foundation for MariaDB AX, our analytical database solution. In comparison to other NoSQL offerings, MariaDB ColumnStore is fairly priced, and offers critical data adapters and connectors that make it significantly easier to publish data into the storage engine to support big data and streaming use cases.
Q3. You have recently announced new product enhancements to MariaDB AX, claiming that these will eliminate the need for traditional data warehouses. Can you please explain how?
Data warehouses are typically expensive and fairly complex to operate. Meanwhile, organizations require increasingly meaningful and timely analytics that fit within hardware and cost limitations. Built for performance and scalability, MariaDB AX uses a distributed, columnar open source storage engine with parallel query processing that allows customers to store more data and analyze it faster. Traditional data warehouses pose challenges such as requiring proprietary hardware and incurring high operational maintenance costs. MariaDB AX, by contrast, supports a wide range of advanced analytic use cases across every industry: for example, identifying health trends to inform healthcare programs and policy, behavioral analysis to inform customer service and sales strategies, and analysis of financial anomalies to inform fraud programs.
Q4. You were quoted saying that “With MariaDB AX, it’s easier than ever to ingest and analyze streaming data from disparate sources, while ensuring the highest level of reliability through new high availability and backup capabilities.” Can you please explain how this works in practice?
MariaDB’s new programmatic integration and packaged data adapters help our customers more easily take data from disparate sources and plug it into ColumnStore. Additionally, building on Red Hat’s GlusterFS, we added the ability to automatically take advantage of high data availability without requiring third-party network storage. We also added a new tool that helps automate the backup process.
Q5. What are the main benefits of the new data ingestion capabilities in MariaDB AX?
With the expanded data ingestion capabilities in MariaDB AX, users are able to aggregate and analyze vast amounts of critical data from a variety of sources in near real time. This includes allowing users to capture data from streaming sources and integrating with the industry-standard Kafka messaging queue. This frees users from the time-intensive process of manually moving and re-formatting data for analysis, and makes it easier to pull from a variety of other systems.
MariaDB is working on building out streaming data adapters for various standard ETL tools, starting with Pentaho.
Q6. What other tools do you offer for performing custom and complex analytics?
MariaDB has added the ability to define user-defined aggregate functions and user-defined window functions. These let the user implement custom functions, analogous to the built-in “SUM”, that operate over a range of values to produce results. Additionally, we’ve added support for common table expressions (CTEs), which offer a way of expressing more complex SQL queries.
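To make these three features concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 module purely as a stand-in, not MariaDB ColumnStore, so the way the custom aggregate is registered differs from how it would be done in MariaDB. The table `sales`, the aggregate `value_range`, and the sample rows are all hypothetical; only the general shape of the SQL is the point.

```python
import sqlite3


class ValueRange:
    """Hypothetical user-defined aggregate: max minus min of the values seen."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def step(self, value):
        # Called once per row in the group; NULLs are ignored.
        if value is None:
            return
        self.lo = value if self.lo is None else min(self.lo, value)
        self.hi = value if self.hi is None else max(self.hi, value)

    def finalize(self):
        # Called once per group to produce the result.
        return None if self.lo is None else self.hi - self.lo


# In-memory SQLite database used only for illustration (assumption:
# the bundled SQLite is 3.25 or newer, which window functions require).
conn = sqlite3.connect(":memory:")
conn.create_aggregate("value_range", 1, ValueRange)

conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", 1, 10), ("east", 2, 30), ("east", 3, 20),
        ("west", 1, 5), ("west", 2, 15),
    ],
)

# 1) User-defined aggregate, used like a built-in such as SUM.
spread = conn.execute(
    "SELECT region, value_range(amount) FROM sales "
    "GROUP BY region ORDER BY region"
).fetchall()

# 2) A CTE names an intermediate result; 3) a window function then
# computes a running total per region over that intermediate result.
running = conn.execute(
    """
    WITH daily AS (
        SELECT region, day, SUM(amount) AS total
        FROM sales
        GROUP BY region, day
    )
    SELECT region, day,
           SUM(total) OVER (PARTITION BY region ORDER BY day) AS running_total
    FROM daily
    ORDER BY region, day
    """
).fetchall()

print(spread)
print(running)
```

The CTE and window-function syntax shown is standard SQL; how a user-defined aggregate or window function is actually written and installed in MariaDB ColumnStore is not covered by this sketch.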
Q7. What are bulk data adapters? Can you please tell us how to use them with an example?
Bulk data adapters integrate with C++, Java and Python libraries to enable direct data ingestion. This allows MariaDB AX users to develop applications and services that continuously collect and write large amounts of data for analysis. Our streaming data adapters are built on top of these bulk data adapters.
Additionally, MariaDB created an integration with Spark, using the bulk data adapters to help customers push the results of machine learning data frames directly into ColumnStore.
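The answer above does not include an example, so the following is only a rough sketch of the buffer-and-flush pattern that an application built on a bulk write interface typically follows. It is not the MariaDB bulk data adapter API: `BatchWriter`, its `sink` callback, and the sample rows are all hypothetical names introduced for illustration, and in a real application the sink would hand each batch to the adapter library.

```python
from typing import Callable, List, Tuple

# Hypothetical row shape: (id, label, reading).
Row = Tuple[int, str, float]


class BatchWriter:
    """Hypothetical helper that buffers rows and hands them to a sink in batches."""

    def __init__(self, sink: Callable[[List[Row]], None], batch_size: int = 1000):
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self._sink = sink
        self._batch_size = batch_size
        self._buffer: List[Row] = []
        self.rows_written = 0

    def write_row(self, row: Row) -> None:
        # Collect rows in memory and flush once a full batch is available.
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Hand the buffered rows to the sink in one call, then start a new batch.
        if not self._buffer:
            return
        self._sink(self._buffer)
        self.rows_written += len(self._buffer)
        self._buffer = []

    def __enter__(self) -> "BatchWriter":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Flush the final partial batch on a clean exit; drop it on error.
        if exc_type is None:
            self.flush()
        else:
            self._buffer = []


# Usage: the sink here just records each batch in a list. A real sink
# would pass the batch to whatever bulk write interface is in use.
batches: List[List[Row]] = []
with BatchWriter(batches.append, batch_size=2) as writer:
    for i in range(5):
        writer.write_row((i, f"sensor-{i}", i * 1.5))

print([len(b) for b in batches], writer.rows_written)
```

Batching amortizes the per-write overhead across many rows, which is the general reason bulk interfaces tend to suit continuous, high-volume ingestion better than row-at-a-time inserts.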
Q8. What was the motivation for a closer integration between Tableau and MariaDB?
The certification with Tableau was driven by growing customer demand for closer integration between Tableau and MariaDB and support for the combined solutions. The integration brings together MariaDB’s highly popular data management products and Tableau’s renowned visualization technologies to enable reliable, fast, data-driven business decisions for organizations around the globe that use these two preferred solutions.
MariaDB TX is designed for transactional workloads and MariaDB AX is geared toward analytical workloads. Most customers generally have needs that make one or the other more suitable. TX and AX are both derived from the same MariaDB code and, in a future release, the two solutions will be combined into a single platform, allowing customers to perform mixed workloads.
David Thompson, VP Engineering at MariaDB Corporation.