On MariaDB Xpand. Q&A with Curt Kolovson

“If Xpand does not demonstrate better throughput and lower latency, MariaDB will donate $25K to either a nonprofit or to offset the infrastructure costs associated with running the test.”

Q1. What is MariaDB Xpand, and how does it fit into MariaDB’s product offerings? 

Xpand is a distributed SQL database that excels at both OLTP and OLAP workloads, as well as combined OLTP/OLAP, i.e., so-called hybrid transaction/analytical processing (HTAP) workloads. Xpand is MariaDB’s high-performance, distributed scale-out database offering. The latest release, Xpand 6, includes general availability of two major features, parallel replication and columnar indexes.

Q2. Why is modern software increasingly serverless?

Serverless computing is growing in popularity primarily as a cost-effective and easy way to support test/dev use cases. Most cloud services require some configuration and setup tasks, as well as choosing the size of a server or selecting bandwidth requirements for networking or storage—and they are generally expensive for test/dev type activities. In the case of serverless, none of that is necessary. A user simply has some program or piece of software that they want to run, and serverless lets them do it at low cost and with ease of use. Serverless infrastructure need not be highly performant or highly available; it simply needs to work for a short period of time. 

Q3. Xpand is based on what you call “Distributed Continuation.” What is it and what are the key benefits? 

“Distributed Continuation” is a phrase that we coined to convey that Xpand is implemented using the continuation-passing style (CPS). This allows for very high levels of concurrent execution. In addition to using CPS, Xpand also uses a distributed dataflow execution engine that executes code fragments to completion, which enables high throughput. The combination of these two methods plays a large part in Xpand’s industry-leading performance and remarkable efficiency. 
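
To make the term concrete, here is a minimal sketch of continuation-passing style in general, not Xpand's actual implementation. Each function hands its result to a continuation `k` instead of returning it, and a small queue-based scheduler runs each queued step to completion. The names `fetch_row`, `add_tax`, `price_with_tax`, and `run`, as well as the toy table, are hypothetical and chosen only for illustration.

```python
from collections import deque

def run(ready):
    """Toy scheduler: run each queued step to completion, in FIFO order."""
    while ready:
        step = ready.popleft()
        step()

def fetch_row(key, table, ready, trace, k):
    """Look up a row, then pass it to continuation k instead of returning it."""
    def step():
        trace.append(("fetch", key))
        k(table[key])
    ready.append(step)

def add_tax(key, row, ready, trace, k):
    """Add 10% tax (integer cents), then pass the result to continuation k."""
    def step():
        trace.append(("tax", key))
        k(row["price_cents"] * 110 // 100)
    ready.append(step)

def price_with_tax(key, table, ready, trace, k):
    """Compose the two steps: the continuation of fetch_row schedules add_tax."""
    fetch_row(key, table, ready, trace,
              lambda row: add_tax(key, row, ready, trace, k))

# Hypothetical data: two requests are in flight at the same time.
table = {1: {"price_cents": 10000}, 2: {"price_cents": 5000}}
ready, trace, results = deque(), [], []
for key in (1, 2):
    price_with_tax(key, table, ready, trace, results.append)
run(ready)

print(results)  # final prices in cents, one per request
print(trace)    # order in which the individual steps ran
```

Because no step blocks while waiting for a return value, the steps of the two requests interleave on a single scheduler, which is the property that makes this style attractive for highly concurrent execution.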

Q4. Technically speaking, in Xpand, partial aggregations run in parallel on multiple nodes, and within each, on multiple cores. What does it mean for a developer of a mission-critical application?

What this means for a developer is that aggregations, as well as all aspects of query execution, are performed in a highly parallelized and efficient manner. This stands in contrast to other database systems that were originally designed in a non-distributed way and were later retrofitted to work in a distributed environment. In the case of Xpand, it was designed from the ground up as a distributed SQL database.
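
The idea of partial aggregation can be sketched in a few lines. This is a conceptual illustration only, assuming a table split into three hypothetical shards and using Python threads as a stand-in for nodes and cores; it says nothing about how Xpand schedules the work internally. Each worker computes a partial `(sum, count)` over its own shard, and a final step merges the partials into the overall average.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_agg(rows):
    """Aggregate one shard locally: return (sum, count) for its rows."""
    return (sum(rows), len(rows))

def merge(partials):
    """Combine per-shard partials into one global (sum, count)."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total, count

# Hypothetical shards, standing in for slices of a table on different nodes.
shards = [[10, 20, 30], [40, 50], [60]]

# Each shard is aggregated by its own worker; only the small partial
# results travel to the merge step, not the raw rows.
with ThreadPoolExecutor(max_workers=len(shards)) as pool:
    partials = list(pool.map(partial_agg, shards))

total, count = merge(partials)
average = total / count
print(partials, total, count, average)
```

AVG is computed from partial sums and counts rather than from partial averages, because averaging the per-shard averages would give the wrong answer when shards differ in size.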

Q5. What are your practical suggestions on how to deal with secondary indexes?

Secondary indexes are useful for supporting queries that access data by attributes other than the primary key(s). The use of secondary indexes should not be overdone, however. Unused or seldom-used secondary indexes incur overhead during create, update, or delete operations. Therefore, secondary indexes should be used sparingly, and only when the benefit they offer to query performance outweighs the overhead of maintaining them.
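
The trade-off can be illustrated with a toy in-memory model. This is a sketch under simplifying assumptions, not a description of how Xpand or any real database stores indexes: the `Table` class, its `structure_writes` counter, and the sample rows are all hypothetical. Every insert writes the base row once and then once more per secondary index, while a lookup on an indexed column examines only the matching rows instead of scanning the whole table.

```python
from collections import defaultdict

class Table:
    """Toy table: a primary-key map plus optional secondary indexes."""

    def __init__(self, indexed_columns=()):
        self.rows = {}
        self.indexes = {col: defaultdict(set) for col in indexed_columns}
        self.structure_writes = 0  # counts every structure touched by inserts

    def insert(self, pk, row):
        self.rows[pk] = row
        self.structure_writes += 1          # write to the base table
        for col, index in self.indexes.items():
            index[row[col]].add(pk)
            self.structure_writes += 1      # one extra write per index

    def find(self, col, value):
        """Return (matching primary keys, number of rows examined)."""
        if col in self.indexes:
            pks = self.indexes[col].get(value, set())
            return sorted(pks), len(pks)    # index lookup: only the matches
        matches = [pk for pk, row in self.rows.items() if row[col] == value]
        return sorted(matches), len(self.rows)  # full scan of the table

data = [
    (1, {"city": "Oslo", "status": "new"}),
    (2, {"city": "Rome", "status": "new"}),
    (3, {"city": "Oslo", "status": "paid"}),
]

plain = Table()
indexed = Table(indexed_columns=("city", "status"))
for pk, row in data:
    plain.insert(pk, row)
    indexed.insert(pk, row)

print(plain.structure_writes, indexed.structure_writes)
print(plain.find("city", "Oslo"), indexed.find("city", "Oslo"))
```

In this model the `status` index triples nothing useful if no query ever filters on `status`: it still adds one write to every insert, which is the kind of unused index worth dropping.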

Q6. How easy is it to manage data in Xpand?

It’s no more difficult than in any SQL database, except that some thought should be given to how data should be distributed. This is not a difficult task, and is even easier with Xpand in MariaDB SkySQL, our fully managed database-as-a-service, which has excellent observability and monitoring tools.  

Q7. What are the typical applications that would benefit most when developers use Xpand?

Xpand is particularly well suited for large-scale database workloads that exceed the capacity of a single-instance database, i.e., a database that runs on a single system. Xpand is a highly scalable distributed database that can support a large number of concurrent clients and challenging workloads (such as combined OLTP/OLAP with high throughput rates) with any read/write ratio and any access pattern, including highly random ones. Xpand is unique in this regard because of its inherent efficiency and scalability.

Q8. Anything else you wish to add?

We recently ran a Sysbench benchmark comparing Xpand’s performance with that of another distributed SQL database, CockroachDB, on each vendor’s own cloud offering. We found that Xpand greatly outscaled and outperformed CockroachDB, reaching 8x, and up to 10x, better throughput with lower latency. Based on this data, we are confident that Xpand will beat any distributed SQL solution, and we have invited organizations that are evaluating a distributed SQL option to put Xpand to the test. If Xpand does not demonstrate better throughput and lower latency, MariaDB will donate $25K to either a nonprofit or to offset the infrastructure costs associated with running the test. If you have a use case that’s well suited for distributed SQL, taking our $25K challenge is a great way to be confident that you’re choosing the best-performing database for the job.

………………………………..

Curt Kolovson has over 40 years of industry experience with database management systems (DBMS). He has worked at VMware, Bell Labs, Hewlett-Packard Labs and HP product divisions, and he’s currently a senior principal software engineer at MariaDB Corporation. Kolovson has both contributed to database research and been a practitioner developing and using DBMS technology. His areas of expertise include storage engines, indexing techniques, transactions, logging and recovery, high availability, disaster recovery, replication, spatial databases including Geographic Information Systems (GIS), temporal (historical) databases such as time-series databases, object-oriented DBMS, and performance tuning and troubleshooting. He was part of the original team that developed Postgres in the mid- to late-1980s at UC Berkeley, under the leadership of Prof. Michael R. Stonebraker, who won the ACM Turing Award in 2014. Kolovson holds an MS and PhD in Computer Science from UC Berkeley; Prof. Stonebraker was his PhD advisor. Kolovson is currently working on Xpand, MariaDB’s scale-out distributed SQL database.

Sponsored by MariaDB.
