Interview with John Partridge, President & CEO of Tokutek, Inc.
“As the database gets used, shards can grow at an uneven rate and one shard might carry a majority of the load. MongoDB corrects this by balancing shards, but because of MongoDB’s lack of concurrency this operation can stall the database unacceptably.”–John Partridge.
I have interviewed John Partridge, President & CEO of Tokutek, Inc.
Q1. Tokutek recently announced that it has eliminated performance issues with MongoDB sharding. What was the problem?
John Partridge: The problem occurs after a shard is created. As the database gets used, shards can grow at an uneven rate and one shard might carry a majority of the load. MongoDB corrects this by balancing shards, but because of MongoDB’s lack of concurrency this operation can stall the database unacceptably (see the benchmark).
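To make the idea of balancing concrete, here is a minimal sketch of a balancer loop in Python. It is a toy illustration only, not MongoDB's actual balancer: the `balance` function, the shard names, and the `threshold` parameter are all hypothetical. It repeatedly moves one unit of data from the fullest shard to the emptiest until the two differ by less than the threshold.

```python
def balance(chunks_per_shard, threshold=2):
    """Even out chunk counts by moving one chunk at a time.

    Hypothetical toy model: each shard is just a name mapped to a chunk count.
    """
    if threshold < 2:
        # With a threshold below 2 a single chunk could bounce back and forth.
        raise ValueError("threshold must be at least 2")

    counts = dict(chunks_per_shard)  # work on a copy
    migrations = []                  # (from_shard, to_shard) for each move

    while True:
        fullest = max(counts, key=counts.get)
        emptiest = min(counts, key=counts.get)
        if counts[fullest] - counts[emptiest] < threshold:
            break  # balanced enough; stop migrating
        counts[fullest] -= 1
        counts[emptiest] += 1
        migrations.append((fullest, emptiest))

    return counts, migrations


counts, migrations = balance({"shard_a": 10, "shard_b": 2, "shard_c": 3})
print(counts)           # final chunk count per shard
print(len(migrations))  # number of chunk moves that were needed
```

Each entry in `migrations` stands for one data move between shards. The stall described in the answer above would arise when such a move blocks ordinary reads and writes while it runs.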
Q2. For what kinds of applications have MongoDB users experienced these bottlenecks?
John Partridge: Users who need to scale out, and rely on sharding to do so.
Q3. What is the solution you propose to this problem?
Q4. How is TokuMX v1.4 able to allow shards to be balanced and added without disruption, for a NoSQL solution that scales up and scales out?
John Partridge: TokuMX replaces the B-tree indexing used in MongoDB with patented Fractal Tree indexing, which allows for significantly better concurrency (among other things). Because of the improved concurrency, data can be copied from one shard to another, and then deleted from the source shard, without unnecessary locking.
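The copy-then-delete pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not TokuMX's implementation: the `Shard` class, `lock_for`, and `migrate` are hypothetical names, and a plain dictionary stands in for the index. The point is only that each step locks a single document, so operations on other documents are never blocked by the migration.

```python
import threading


class Shard:
    """Hypothetical in-memory shard with one lock per document."""

    def __init__(self, name):
        self.name = name
        self.docs = {}                   # document key -> document
        self._locks = {}                 # document key -> lock
        self._guard = threading.Lock()   # protects the lock table itself

    def lock_for(self, key):
        # Create the per-document lock on first use.
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())


def migrate(source, target, keys):
    """Move documents from source to target: copy first, then delete."""
    # Phase 1: copy each document while holding only that document's lock.
    for key in keys:
        with source.lock_for(key):
            target.docs[key] = source.docs[key]
    # Phase 2: remove the originals once every copy exists on the target.
    for key in keys:
        with source.lock_for(key):
            del source.docs[key]


shard_a, shard_b = Shard("a"), Shard("b")
shard_a.docs = {1: "alice", 2: "bob", 3: "carol"}
migrate(shard_a, shard_b, keys=[2, 3])
print(shard_a.docs, shard_b.docs)
```

Under a single database-level lock, both phases would instead hold one lock for the whole loop, which is the kind of stall the answer above describes.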
Q5. What is the difference in performance of your solution with respect to the basic MongoDB? What “basic” MongoDB do you use for this comparison?
John Partridge: “Basic” MongoDB is the distro that you get from MongoDB (10gen). We typically see 20x performance improvements, but as you might imagine, it depends on the application. Because TokuMX offers document-level locking rather than database-level locking, TokuMX shines when there are significant reads *and* writes.
Q6. How do you compare TokuMX with other distributions of MongoDB, such as the one from 10gen (now MongoDB)?
John Partridge: There are three major differences: 20x performance improvement, 90% smaller database size (we compress the data), and support for ACID transactions. Look at the bottom of http://www.tokutek.com/products/tokumx-for-mongodb/ for more information on each of these benefits.
Mr. Partridge brings over twenty years of experience in the software industry as a developer, investor, and entrepreneur. He joins Tokutek from StreamBase Systems which John co-founded with database pioneer Dr. Michael Stonebraker. He started his career as a software developer at Microsoft Corporation where he co-authored Excel v1.0. He later worked as a venture capitalist at Accel Partners and the Summit Accelerator Fund where he specialized in investing in early stage internet infrastructure and enterprise software companies. John holds an A.B. in Applied Mathematics / Computer Science from Harvard University and an MBA from the Stanford University Graduate School of Business.
– TokuMX vs. MongoDB : Sharding Balancer Performance Posted on February 16, 2014 by Tim Callaghan, Tokutek.
– What’s new in TokuMX 1.4, Part 4: Smaller, faster sharded clusters. Posted on February 20, 2014 by Leif Walsh, Tokutek.