Comments on: MySQL-State of the Union. Interview with Tomas Ulin.

By: Dave Segleau

Dave Segleau — Fri, 22 Feb 2013 18:48:41 +0000

Robert,

The Oracle NoSQL Database and MySQL are complementary technologies and enable Oracle to offer a complete solution stack to its customers. For web applications well served by a relational database, users can rely on MySQL, with the option of NoSQL access to MySQL via memcached to speed up key-value read and write operations. For applications primarily handling large amounts of horizontally distributed unstructured and/or evolving data, Oracle NoSQL Database is the way to go. Oracle NoSQL Database can be used standalone or in conjunction with Hadoop and the Oracle Database for complex queries.

Regards,

Dave

By: Tomas Ulin

Tomas Ulin — Tue, 19 Feb 2013 09:55:12 +0000

Christos, a MySQL setup that is not partitioned in some way (i.e. all data is written to all MySQL servers) will eventually stop scaling as the update load grows.

This limitation, however, is shared by any architecture that fully replicates the data, irrespective of whether you write to a “single master” or to “several masters”. In other words, the scalability issue does not stem from having a “single master”: you can push the same amount of data through a cluster with a single master as through one with multiple masters because, in the end, every node must be able to sustain the same number of updates.
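The argument above can be put into a small back-of-the-envelope model. This is only a sketch under idealized assumptions (identical nodes, no replication overhead, evenly spread writes); the function names and numbers are hypothetical and not taken from any MySQL tooling.

```python
def per_node_write_load(total_writes_per_sec, nodes, fully_replicated):
    """Writes each node must apply per second in this idealized model."""
    if fully_replicated:
        # Every node applies every write, no matter how many masters accept them.
        return total_writes_per_sec
    # Partitioned: each node applies only its own share of the writes.
    return total_writes_per_sec / nodes


def max_cluster_writes(node_capacity, nodes, fully_replicated):
    """Highest total write rate the cluster can sustain in this model."""
    if fully_replicated:
        # Adding nodes does not help: the slowest full copy is the ceiling.
        return node_capacity
    return node_capacity * nodes
```

In this model a fully replicated setup of 1 node or 10 nodes tops out at the same write rate, while a partitioned setup grows with the node count.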

What we have done in 5.6 is to allow you to scale your master (and slaves) further, pushing that limit out. When you eventually hit the limit, you need to start partitioning your data somehow. This is how, for example, MySQL Cluster achieves linear scalability: it partitions the data internally, transparently to the user. For regular MySQL master-slave replication we currently do not provide any off-the-shelf solution to partition your data transparently. This is something you either have to build into your application logic or handle with middleware that you build, as several companies have successfully done. Some of these implementations are available as open source solutions.
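As a rough illustration of what partitioning in application logic can look like, here is a minimal sketch of hash-based shard routing. It is not how MySQL Cluster or any particular middleware does it; the shard names and the helper functions `shard_for` and `route_writes` are made up for this example.

```python
import hashlib

# Hypothetical shard map: each entry stands in for an independent
# MySQL master (with its own slaves). The names are invented.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]


def shard_for(key, shards=SHARDS):
    """Pick the shard that owns `key` using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def route_writes(keys, shards=SHARDS):
    """Group keys by owning shard, so each master sees only its share."""
    plan = {name: [] for name in shards}
    for key in keys:
        plan[shard_for(key, shards)].append(key)
    return plan
```

Plain modulo hashing like this remaps most keys when the number of shards changes, which is one reason real deployments often use a lookup directory or consistent hashing instead.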

BR, Tomas

By: Robert Greene

Robert Greene — Mon, 18 Feb 2013 03:50:52 +0000

Tomas, I wonder if you have any comments on the role of MySQL and the direction it is headed in, compared to the Oracle NoSQL Database initiative, which is building on scale-out key:value distribution and reliable replication services and seems to overlap considerably with some of the messaging on MySQL.

Also – somewhat of a side note, but I think it’s interesting that both Oracle’s NoSQL Database and MySQL are offering an API specifically integrated for Node.js. While I am not a huge fan of JavaScript, its penetration is undeniable and what the Node.js folks have done may prove to be transformative for the software industry. It may very well move the web space to distributed objects and eliminate the pain of XML/JSON/BSON, etc. Giving that server technology seamless database access is a smart move.

-Robert

By: Gagan Mehra

Gagan Mehra — Sat, 16 Feb 2013 20:42:15 +0000

There is a solution to these problems: architecting the big data solution in a different way. We have several customers using our in-memory product to hold terabytes of data in memory for fast access. For reads, the data can be bulk-loaded from the data source (like MySQL) and accessed in microseconds within the application without any scale limitations. For writes, the data is set up to be eventually consistent by doing write-behinds every few hours. In instances where strong consistency is required, the writes are either micro-batched or use a variety of locking configurations to optimize performance. Happy to share more details or discuss specific use cases.
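For readers unfamiliar with the write-behind pattern mentioned above, a minimal sketch follows. It is a generic illustration, not the commenter's product: the class `WriteBehindCache` is hypothetical, and a plain dict stands in for the database.

```python
class WriteBehindCache:
    """In-memory store that defers writes to a backing store until flush()."""

    def __init__(self, backing_store):
        self._backing = backing_store  # stand-in for the database (a dict here)
        self._cache = {}
        self._dirty = set()  # keys changed in memory but not yet written back

    def load(self, keys):
        """Bulk-load the given keys from the backing store into memory."""
        for key in keys:
            if key in self._backing:
                self._cache[key] = self._backing[key]

    def get(self, key):
        """Read from memory only; returns None if the key was never loaded."""
        return self._cache.get(key)

    def put(self, key, value):
        """Write to memory and remember the key for the next flush."""
        self._cache[key] = value
        self._dirty.add(key)

    def flush(self):
        """Write all dirty entries back; returns how many were written."""
        written = 0
        for key in self._dirty:
            self._backing[key] = self._cache[key]
            written += 1
        self._dirty.clear()
        return written
```

Between flushes the backing store lags behind the cache, which is the eventual consistency trade-off described in the comment.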

By: Christos Kalantzis

Christos Kalantzis — Fri, 15 Feb 2013 17:44:49 +0000

Hi Tomas,

Thanks for replying. I do agree that replication will help scale out a MySQL installation for reads. I have used this strategy successfully and recommend it regularly. However, you quickly run into a write scale issue on a single Master node. When your writes per second go beyond what the hardware (mostly) and the software (likely) can handle, a single RDBMS node (not just MySQL) fails to support an Internet-based application.

I also agree that MySQL Cluster’s 255 node limit should handle all but the most extreme edge cases. Zillow has been using MySQL Cluster successfully for a while now. However, as companies move more and more to the cloud, they realize that solutions that depend on physical hardware’s IOPS and network speeds don’t translate very well to a virtualized environment that has both limited IOPS and limited network throughput. Applications need to adjust to that reality, so eventually consistent solutions, which can excel in less powerful environments, are being used more and more.

Furthermore, there are other limitations to Cluster, which you even point out here: http://dev.mysql.com/doc/mysql-cluster-excerpt/5.1/en/mysql-cluster-limitations.html

I’m a fan of MySQL; I even hosted a podcast for a while called the MySQLGuy Podcast. My goal is not to diminish the great work your team is doing, but to set the appropriate expectations for any users reading Roberto’s article.

Cheers,
Christos

By: Tomas Ulin

Tomas Ulin — Thu, 14 Feb 2013 10:32:41 +0000

Thanks for taking the time to read and comment on the article. We are very excited about the new MySQL 5.6 release, and the trajectory of MySQL Cluster adoption.

To answer your specific points, it is correct that scaling up will only get a user so far, but in a little over 4 years, we have scaled MySQL from 4 cores (at best) to 48 cores – this takes users an awfully long way, for both internal and internet-based applications.

MySQL allows you to add replication slaves online, so you can scale out your database on commodity hardware on-premise or in the cloud – and you can choose between Asynchronous, Semi-synchronous, and Synchronous replication depending on your availability and consistency requirements.

To enumerate specific scalability enhancements in MySQL 5.6:
– InnoDB enhancements – allow you to scale your MySQL system further, giving you 2x more performance
– Optimizer enhancements – allow anything from 2x to 280x faster query performance
– Memcached API – allows you to optimize performance-critical parts of your application without sacrificing consistency, while retaining the flexibility and rapid development that SQL enables – 10x higher performance
– Replication enhancements: Multithreaded slave – allows you to scale your MySQL system further, giving you 5x more performance
– Replication enhancements: Binlog group commit – allows you to scale your MySQL system further, giving you 4x more performance

With respect to your comments on MySQL Cluster:
– Hardware: The indexes are partitioned just like the rest of the data – each data node stores only its share, which enables scale-out across commodity hardware.
– Scale: Up to 255 nodes are currently supported by MySQL Cluster. The important factor is node performance. In recent benchmark tests conducted with Intel (1), a 30 node Cluster delivered just under 20m UPDATE operations per second (1.2 billion per minute). We believe that MySQL Cluster scale is not inhibited by node count.
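The benchmark figures quoted above can be sanity-checked with a few lines of arithmetic. The per-node average is an assumption added here for illustration: it takes the rounded 20m figure and supposes the load was spread evenly over all 30 nodes, which the comment does not state.

```python
# Figures from the comment; "just under 20m" is rounded up to 20 million.
UPDATES_PER_SECOND = 20_000_000
NODES = 30

# 20 million per second over 60 seconds gives the quoted 1.2 billion per minute.
updates_per_minute = UPDATES_PER_SECOND * 60

# Assumption: even spread across nodes, so this is only a rough average.
avg_updates_per_node_per_second = UPDATES_PER_SECOND / NODES

print(updates_per_minute)
print(round(avg_updates_per_node_per_second))
```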

In summary, MySQL has been selected by many of the world’s largest and most demanding internet-scale services – for example, Facebook, YouTube, and Twitter all run on MySQL. Equally important is that some of the newest and fastest growing services have also selected MySQL, despite having a myriad of other options to choose from: Tumblr, Pinterest, Box, Quora, etc.

With MySQL 5.6 and MySQL Cluster we are pushing the boundaries further. With future releases we will continue to do even more of this.

1. http://www.mysql.com/why-mysql/benchmarks/mysql-cluster/

By: Christos Kalantzis

Christos Kalantzis — Wed, 13 Feb 2013 17:19:15 +0000

On MySQL 5.6 scaling up: this is only feasible for internal applications whose code you don’t want to spend the effort to refactor. Any Internet application that needs to support tens of millions of users needs a scale-out strategy.

On MySQL Cluster 7.x: this is a huge step forward. However, there are still hardware and scale limitations. It is a stopgap solution on the way to properly scaling an application to tens of millions of users.
-Hardware: Every node needs to hold at least the entire index of the dataset in RAM. This makes using commodity hardware or virtualized hardware impractical.
-Scale: There is an upper limit on how many nodes your cluster can have. So once you hit that limit, you are back to scaling up.

I tackled the complexities of scaling a MySQL solution in my previous company. If your use case fits the limitations, these are great products. If you are going to be the next Big Data “thing”, these are both non-starters.