On SingleStore Kai for MongoDB. Q&A with Jason Thorsness.
Q1. MongoDB is one of the most popular, widely adopted databases. What are the pros and cons of using MongoDB?
MongoDB has grown to be one of the most widely adopted NoSQL databases for JSON-like documents across many industries. MongoDB’s document model and API align closely with how many applications produce and consume data, and its developer experience and broad ecosystem of supporting libraries and integrations make it a natural choice for many use cases. However, MongoDB’s underlying architecture is not optimized for in-app analytics or for any query involving large numbers of documents in collections. It also lacks strong support for the fast vector operations used in AI scenarios, and it has only limited support for integration with SQL-based clients and tools.
Q2. You have recently launched SingleStore Kai for MongoDB, a MongoDB API to Power Real-Time Analytics on JSON. Can you please tell us a bit about it?
SingleStore Kai™ for MongoDB enables applications written for MongoDB to achieve 100x faster in-app analytics on their data without requiring any code changes or data transformations. The new API is MongoDB wire protocol compatible and enables developers to power their interactive applications with all the high-performance features of SingleStoreDB. This API helps solve many of the limitations of MongoDB and opens up new possibilities such as high-performance full collection aggregations for in-app analytics experiences and vector-based search (image matching, etc.) for MongoDB-based applications.
Q3. Is there any need for any query changes or data transformations?
At launch, SingleStore Kai™ for MongoDB implements all of the MongoDB features most applications will use, and data is stored by default in SingleStoreDB’s JSON storage type. We expect most applications will not require any query changes or data transformation to use the service successfully. That said, it’s possible some applications might use a command that is not implemented yet, or rely on a nuance of the BSON type system that is not well reflected in JSON. We have a series of updates planned over the next few months towards ever-increasing compatibility (targeting the MongoDB 6.0 server version) and full end-to-end BSON support for storage. We’ll use all feedback we receive to help prioritize these investments, so please send it our way!
Q4. Why is JSON so important?
JSON, for all its limitations and specification ambiguity, has become the most widespread format for semi-structured data. Everything, including humans, can parse and read JSON. Even ChatGPT understands the format and concepts of JSON well enough to return responses in JSON. It’s everywhere. However, many of the databases designed with JSON first have not been able to provide the sort of in-app analytics available in the classic SQL systems, leaving a huge amount of the modern world’s data unable to be analyzed in real time. At SingleStore we decided to bridge the gap by making our high-performance feature set available to all MongoDB applications.
Q5. Do you have any benchmarks that indicate that you get faster analytics on JSON data within existing MongoDB applications?
Yes, at launch our benchmarking suggests SingleStore Kai™ enables a large speedup, from 100x to over 1000x, on many MongoDB queries. For example, a $group that takes the count, sum, and average over a few fields after a $match can complete in milliseconds over a hundred million documents, where native MongoDB can take over a minute. We ran extensive performance tests, both internal tests and industry benchmarks, to validate this. For example, we ran the ClickBench analytics performance benchmark, in which 43 queries are executed over a data set of about 100 million records. We found that SingleStore Kai™ delivered superior performance compared to native MongoDB, including:
>80% of the queries completed in less than half the time.
>50% of the queries were at least 43 times faster.
>30% of the queries were at least 100 times faster.
You can learn more about the benchmarks and results here
This opens up completely new possibilities for applications to perform in-app real-time analytics queries to provide experiences that were not possible before without complex pre-aggregation or estimation. Not all query shapes are optimized yet and we are continuing to add more over the coming months as well.
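The query shape mentioned in this answer, a $match followed by a $group that takes a count, sum, and average, can be sketched as follows. This is a minimal illustration rather than one of the benchmark queries: the collection and field names (`orders`, `status`, `region`, `amount`) are hypothetical, and the pipeline is built as plain Python data so the sketch runs without a database.

```python
def build_summary_pipeline(match_filter, group_key, value_field):
    """Return a $match + $group aggregation pipeline as plain Python data."""
    return [
        # Stage 1: keep only the documents that satisfy the filter.
        {"$match": match_filter},
        # Stage 2: group the survivors and compute count, sum, and average.
        {
            "$group": {
                "_id": f"${group_key}",
                "count": {"$sum": 1},
                "total": {"$sum": f"${value_field}"},
                "average": {"$avg": f"${value_field}"},
            }
        },
    ]


# Hypothetical field names, chosen only for illustration.
pipeline = build_summary_pipeline({"status": "shipped"}, "region", "amount")

# With a MongoDB driver such as PyMongo, the pipeline would be run as:
#     results = list(db.orders.aggregate(pipeline))
```

Because the pipeline is ordinary MongoDB aggregation syntax, the same data structure could in principle be sent unchanged to either a native MongoDB server or a wire-compatible service.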
Q6. What does it mean in practice that the new API is MongoDB wire protocol compatible?
We have our own implementation of the network protocol used by MongoDB 6.0. This means that to a client application, our service appears to be a MongoDB server. Developers who are familiar with MongoDB and have client applications written using the normal MongoDB client drivers and libraries can use SingleStore Kai™ for MongoDB simply by changing their connection string. This lets them get the advantages of SingleStoreDB while continuing to leverage the MongoDB tools, drivers, skill sets and ecosystem they are most familiar with.
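As a rough sketch of what "changing the connection string" can look like, the helper below swaps only the host portion of a MongoDB URI and leaves credentials, database name, and options untouched. The host names are placeholders invented for this example, and the real endpoint and any required connection options would come from the service itself.

```python
from urllib.parse import urlsplit, urlunsplit


def retarget_uri(uri, new_host):
    """Return `uri` with its host[:port] replaced by `new_host`."""
    parts = urlsplit(uri)
    # netloc looks like "user:password@host:port"; keep everything before "@".
    userinfo, separator, _old_host = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{separator}{new_host}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


# Placeholder hosts: neither is a real endpoint.
old_uri = "mongodb://app:secret@mongo.example.internal:27017/shop?retryWrites=true"
new_uri = retarget_uri(old_uri, "kai.example.invalid:27017")

# The application code that follows stays the same, for example with PyMongo:
#     client = MongoClient(new_uri)
```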
Q7. How is it possible that developers can write applications with analytics with SingleStoreDB using the same MongoDB commands?
Our service translates MongoDB commands into the language used by the SingleStoreDB engine, and translates responses back from the native SingleStoreDB response format to the MongoDB response format. As long as the service has implemented the translation, the same command will work against both native MongoDB and SingleStore Kai for MongoDB. The interoperability between SQL and MongoDB is really compelling because, as a developer, you do not have to choose between SQL and NoSQL (the MongoDB query language). You can now effectively use both languages within a single engine to power your applications.
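To make the idea of command translation concrete, here is a toy sketch that turns a very small subset of MongoDB filter documents into a parameterized SQL WHERE clause. It is not how SingleStore Kai is implemented: the column name `doc`, the choice of `JSON_EXTRACT_STRING` and `JSON_EXTRACT_DOUBLE` as extraction functions, and the supported operator set are all assumptions made for illustration, and real MongoDB comparison semantics are considerably richer.

```python
# Toy illustration only: NOT the actual translation layer.
OPERATORS = {"$eq": "=", "$gt": ">", "$gte": ">=", "$lt": "<", "$lte": "<="}


def filter_to_where(filter_doc, json_column="doc"):
    """Translate a flat MongoDB-style filter into (where_clause, params)."""
    clauses, params = [], []
    for field, condition in filter_doc.items():
        # Field names are embedded in the SQL text, so accept only simple names.
        if not field.isidentifier():
            raise ValueError(f"unsupported field name: {field!r}")
        # A bare value means equality; a dict holds operator/value pairs.
        if not isinstance(condition, dict):
            condition = {"$eq": condition}
        for operator, value in condition.items():
            if operator not in OPERATORS:
                raise ValueError(f"unsupported operator: {operator!r}")
            # Assumed extraction functions, picked by the Python value type.
            extract = (
                "JSON_EXTRACT_STRING" if isinstance(value, str)
                else "JSON_EXTRACT_DOUBLE"
            )
            clauses.append(
                f"{extract}({json_column}, '{field}') {OPERATORS[operator]} %s"
            )
            # Values travel as parameters rather than inside the SQL text.
            params.append(value)
    return " AND ".join(clauses), params


where, params = filter_to_where({"status": "shipped", "total": {"$gt": 100}})
```

A production translation layer would also need to handle nested fields, arrays, BSON types, and the response mapping in the other direction, none of which this sketch attempts.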
Q8. Does SingleStoreDB augment MongoDB or replace it?
Developers can use SingleStore Kai™ for MongoDB in different ways. They can switch their application to SingleStoreDB completely and replace native MongoDB, which in some cases can simplify and improve their application. Or, they can augment their MongoDB instance with SingleStore Kai. In this augmentation model, MongoDB continues to be the operational database, while SingleStoreDB becomes the analytical engine. Collections of JSON that require analytics are replicated into SingleStoreDB and the application runs ultra-fast analytical queries against SingleStoreDB. The key difference here compared to other relational databases is that with SingleStore Kai no changes to your MongoDB queries, data flattening or transformations are required. For the augmentation case we have a data replication and CDC feature, currently in private preview, that effortlessly synchronizes collections in SingleStoreDB from native MongoDB’s change streams, so that transactional queries can be performed against a primary MongoDB instance while the data is replicated with very low latency into a SingleStoreDB instance to support high-performance analytics queries.
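The general idea behind change-stream-based replication can be sketched with an in-memory stand-in for the replica. This is only a conceptual illustration and says nothing about how the private-preview feature works internally. The event fields used (`operationType`, `documentKey`, `fullDocument`, `updateDescription`) follow the shape of MongoDB change events, while the sample documents are invented and only top-level field updates are handled.

```python
HANDLED = ("insert", "replace", "update", "delete")


def apply_change_event(replica, event):
    """Apply one change-stream-style event to a dict keyed by document _id."""
    operation = event["operationType"]
    if operation not in HANDLED:
        # Other event types (drop, rename, invalidate, ...) are ignored here.
        return replica
    doc_id = event["documentKey"]["_id"]
    if operation in ("insert", "replace"):
        replica[doc_id] = dict(event["fullDocument"])
    elif operation == "update":
        changes = event["updateDescription"]
        document = replica.setdefault(doc_id, {"_id": doc_id})
        # Only top-level fields are handled; dotted paths are out of scope.
        document.update(changes.get("updatedFields", {}))
        for name in changes.get("removedFields", []):
            document.pop(name, None)
    else:  # delete
        replica.pop(doc_id, None)
    return replica


# Invented sample events, in the order a change stream might deliver them.
events = [
    {"operationType": "insert", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "sku": "a", "qty": 2, "note": "new"}},
    {"operationType": "update", "documentKey": {"_id": 1},
     "updateDescription": {"updatedFields": {"qty": 5},
                           "removedFields": ["note"]}},
    {"operationType": "insert", "documentKey": {"_id": 2},
     "fullDocument": {"_id": 2, "sku": "b", "qty": 1}},
    {"operationType": "delete", "documentKey": {"_id": 2}},
]

replica = {}
for event in events:
    apply_change_event(replica, event)
```

In a real deployment the events would come from the primary MongoDB instance's change stream and the target would be the analytical store rather than a Python dict.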
Q9. You mentioned that “SingleStore is already MySQL wire protocol compatible and now with the addition of SingleStore Kai for MongoDB, developers can essentially get the best of both worlds”. How does it work in practice?
Some applications are just simpler with a document database model than with a SQL table-based model. However, SQL’s tables have a number of advantages related to schema, query language and capabilities. SingleStore Kai™ for MongoDB provides a mapping between both approaches, enabling developers to take the best of both. For example, you can set it up so the data is inserted and read through SingleStore Kai™ for MongoDB as documents. From the SQL API this data can be queried as a table, optionally with top-level columns populated from top-level fields in the BSON document. You can also use SQL features directly from within your MongoDB queries when the existing MongoDB API support does not provide enough functionality.
Q10. How is it possible to test this new product?
You can join our public preview as part of the SingleStoreDB Cloud offering.
Q11. Anything you wish to add?
One of the things I’m most excited about is the introduction of vector-based similarity searches for MongoDB-based AI applications. As tools like ChatGPT and, at SingleStore, our own AI bot SQrL (“squirrel”) gain prominence and assist us with a variety of workloads, vector functionality is a key feature for AI innovation. We’ve exposed these as $dotProduct and $euclideanDistance aggregation expressions in our API and I’m excited to see some of the things customers build on top of them.
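For readers less familiar with the two measures named here, the sketch below computes a dot product and a Euclidean distance in plain Python and uses the distance to rank a few candidate vectors. It shows only the underlying math. It does not show the syntax of the $dotProduct or $euclideanDistance expressions, and the vectors and ids are made up for the example.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; larger means more aligned vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, candidates):
    """Return candidate ids ordered from closest to farthest from `query`."""
    return sorted(
        candidates,
        key=lambda key: euclidean_distance(query, candidates[key]),
    )


# Made-up two-dimensional "embeddings" keyed by document id.
embeddings = {"a": [3.0, 4.0], "b": [1.0, 0.0], "c": [0.0, 2.0]}
ranking = nearest([0.0, 0.0], embeddings)
```

A similarity search over stored embeddings amounts to evaluating one of these measures between a query vector and each candidate, then sorting or filtering on the result.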
Jason Thorsness, Principal Software Engineer, SingleStore
Sponsored by SingleStore.