In-memory OLTP database. Interview with Asa Holmstrom.
“Those who claim they can give you both ACID transactions and linear scalability at the same time are not telling the truth because it is theoretically proven impossible” –Asa Holmstrom.
I heard about a startup called Starcounter and wanted to know more, so I interviewed the company’s CEO, Asa Holmstrom.
RVZ
Q1. You just launched the Starcounter 2.0 public beta. What is it, and who can already use it?
Asa Holmstrom: Starcounter is a high-performance in-memory OLTP database. We have partners who have built applications on top of Starcounter, e.g. an AdServe application and a retail application. Today Starcounter has 60+ customers using it in production.
Q2. You define Starcounter as “memory centric”, using a technique you call “VMDBMS”. What is special about VMDBMS?
Asa Holmstrom: VMDBMS integrates the application runtime virtual machine (VM) with the database management system (DBMS). Data resides in one single place in RAM at all times; no data is transferred back and forth between the database memory and the temporary storage (object heap) of the application. The VMDBMS makes Starcounter significantly faster than other in-memory databases.
Q3. When you say “the VMDBMS makes Starcounter significantly faster than other in-memory databases”, could you please give some specific benchmarking numbers? Which other in-memory databases did you compare with your benchmark?
Asa Holmstrom: In general we are 100 times faster than any other RDBMS: a factor of 10 comes from being an IMDBMS, and another factor of 10 comes from the VMDBMS.
Q4. How do you handle situations where data in RAM is no longer available due to hardware failures?
Asa Holmstrom: In Starcounter the data is just as secure as in any disk-centric database. Image files and the transaction log are stored on disk, and a transaction is regarded as committed only after it has been written to the transaction log.
When Starcounter is restarted after a crash, the database is recovered automatically. To guarantee high availability we recommend that our customers run a hot standby machine which subscribes to the transaction log.
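The commit rule described in this answer (write to the transaction log first, treat the transaction as committed only afterwards, replay the log on restart) is the general write-ahead logging pattern. The following is a minimal Python sketch of that pattern only. The class `TinyLoggedStore`, its JSON-lines log format, and the `demo` helper are hypothetical illustrations, not Starcounter’s implementation or API, and the sketch ignores image files, torn log records, and standby replication.

```python
import json
import os
import tempfile


class TinyLoggedStore:
    """Toy in-memory key-value store with a write-ahead transaction log.

    Hypothetical illustration only; not Starcounter's design or API.
    """

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}      # the "database", held entirely in RAM
        self._recover()     # rebuild the RAM state from the log on startup

    def _recover(self):
        # Replay every logged transaction in the order it was committed.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                self.data.update(json.loads(line))

    def commit(self, writes):
        # 1. Append the transaction to the log and force it to disk.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(writes) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only now apply it in RAM and regard it as committed.
        self.data.update(writes)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "txn.log")
        store = TinyLoggedStore(path)
        store.commit({"account:1": 100})
        store.commit({"account:1": 70, "account:2": 30})
        # Simulate a crash: discard the in-memory state, then restart.
        del store
        recovered = TinyLoggedStore(path)
        return recovered.data
```

Calling `demo()` rebuilds the same two account balances from the log alone, which is the property the answer relies on. A hot standby could, in principle, apply the same log records as they arrive, but that part is not modelled here.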
Q5. Goetz Graefe, HP fellow, commented in an interview (ref.1) that “disks will continue to have a role and economic value where the database also contains history, including cold history such as transactions that affected the account balances, login & logout events, click streams eventually leading to shopping carts, etc.” What is your take on this?
Asa Holmstrom: Since hardware limits RAM databases to about 2TB in practice, there will still be a need for database storage on disk.
Q6. You claim to achieve high performance and consistent data. Do you have any benchmark results to support such a claim?
Asa Holmstrom: Yes, we have run internal benchmark tests to compare performance while keeping data consistent.
Q7. Do you have some results of your benchmark tests publicly available? If yes, could you please summarize the main results here?
Asa Holmstrom: As TPC tests are not applicable to us, we have done some internal tests. We can’t share them with you.
Q8. What kind of consistency do you implement?
Asa Holmstrom: We support true ACID consistency, implemented using snapshot isolation and fake writes, in a way similar to Oracle.
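For readers unfamiliar with snapshot isolation, here is a minimal textbook-style sketch in Python. It assumes a multi-version store in which each transaction reads the data as of its start and a commit fails if another transaction has already committed a write to the same key (the "first committer wins" rule). The names `SnapshotStore`, `Transaction`, and `WriteConflict` are hypothetical, the "fake writes" mentioned in the answer are not modelled, and nothing here should be read as how Starcounter or Oracle actually implement it.

```python
class WriteConflict(Exception):
    """Raised when a concurrent transaction already committed the same key."""


class SnapshotStore:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # timestamp of the most recent commit

    def begin(self):
        # A transaction's snapshot is the state as of the latest commit.
        return Transaction(self, self.clock)


class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}    # private, uncommitted writes

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        # Newest version that was committed at or before our snapshot.
        for commit_ts, value in reversed(self.store.versions.get(key, [])):
            if commit_ts <= self.start_ts:
                return value
        return None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First committer wins: abort if any key we wrote has a version
        # committed after our snapshot was taken.
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.start_ts:
                raise WriteConflict(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append(
                (self.store.clock, value)
            )


def demo():
    store = SnapshotStore()
    setup = store.begin()
    setup.write("stock", 10)
    setup.commit()

    reader = store.begin()   # snapshot taken before the update below
    writer = store.begin()
    writer.write("stock", 9)
    writer.commit()

    seen_by_reader = reader.read("stock")       # still the old snapshot
    seen_by_late = store.begin().read("stock")  # a new snapshot sees the update

    reader.write("stock", 8)
    try:
        reader.commit()
        conflicted = False
    except WriteConflict:
        conflicted = True
    return seen_by_reader, seen_by_late, conflicted
```

In `demo()`, the earlier transaction keeps reading the value from its own snapshot while a later one sees the update, and the earlier transaction's conflicting write is rejected at commit time.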
Q9. How do you achieve scalability?
Asa Holmstrom: ACID transactions are not scalable. All parallel ACID transactions need to be synchronized, and the closer together the transactions are executed in space, the faster the synchronization becomes. Therefore you get the best performance by executing all ACID transactions on one machine. We call this “scaling in”. When it comes to storage, you scale up a Starcounter database by adding more RAM.
Q10. For which class of applications is it realistic to expect to execute all ACID transactions on one machine?
Asa Holmstrom: For all applications where you want high transactional throughput. When you have parallel ACID transactions you need to synchronize them, and this synchronization becomes harder when you scale out to several different machines. The benefits of scaling out grow linearly with the number of machines, but the cost of synchronization grows quadratically. Consequently you do not gain anything by scaling out. In fact, you get better total transactional throughput by running all transactions in RAM on one machine, which we call “scaling in”. No other database can give you the same total ACID transactional throughput as Starcounter. Those who claim they can give you both ACID transactions and linear scalability at the same time are not telling the truth because it is theoretically proven impossible. Databases which can give you ACID transactions or linear scalability cannot give you both of these things at the same time.
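The linear-benefit versus quadratic-cost argument in this answer can be made concrete with a toy model. The Python sketch below assumes that every machine adds a fixed amount of throughput and that every pair of machines incurs a fixed synchronization cost. Both constants (`per_node`, `pair_cost`) and both function names are invented for illustration; they are not measurements of Starcounter or of any other system.

```python
def net_throughput(nodes, per_node=100.0, pair_cost=5.0):
    """Toy model: linear benefit minus a per-pair synchronization cost.

    All constants are made-up illustration values, not benchmark data.
    """
    benefit = per_node * nodes                        # grows linearly
    sync_cost = pair_cost * nodes * (nodes - 1) / 2   # grows quadratically
    return benefit - sync_cost


def best_cluster_size(max_nodes, per_node=100.0, pair_cost=5.0):
    """Return the node count in 1..max_nodes with the highest net throughput."""
    return max(
        range(1, max_nodes + 1),
        key=lambda n: net_throughput(n, per_node, pair_cost),
    )
```

Under this model the outcome depends entirely on the assumed ratio between the two constants. If synchronizing a pair of machines costs more than one machine contributes (for example `pair_cost=200.0` against `per_node=100.0`), a single machine is the best configuration; with a much cheaper synchronization cost the net throughput first rises and then falls back to zero as the quadratic term catches up.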
Q11. How do you define queries and updates?
Asa Holmstrom: We distinguish between read-only transactions and read-write transactions. You can only write (insert/update) database data using a read-write transaction.
Q12. Are you able to handle Big Data analytics with your system?
Asa Holmstrom: Starcounter is optimized for transactional processing, not for analytical processing.
Q13. How does Starcounter differ from other in-memory databases, such as SAP HANA and McObject?
Asa Holmstrom: In general the primary differentiator between Starcounter and any other in-memory database is the VMDBMS. SAP HANA has primarily an OLAP focus.
Q14. From a user perspective, what is the value proposition of having a VMDBMS as a database engine?
Asa Holmstrom: Unbeatable ACID transactional performance.
Q15. How do you differentiate with respect to VoltDB?
Asa Holmstrom: Better ACID transactional performance. VoltDB gives you either ACID transactions (on one machine) or the possibility to scale out without any guarantees of global database consistency (no ACID). Starcounter has a native .NET object interface which makes it easy to use from any .NET language.
Q16. Is Starcounter 2.0 open source? If not, do you have any plan to make it open source?
Asa Holmstrom: We do not currently have any plans to make Starcounter open source.
——————
CEO Asa Holmstrom brings to her role at Starcounter more than 20 years of executive leadership in the IT industry. Previously, she served as the President of Columbitech, where she successfully established its operations in the U.S. Prior to Columbitech, Asa was CEO of Kvadrat, a technology consultancy firm. Asa also spent time as a management consultant, focusing on sales, business development and leadership within global technology companies such as Ericsson and Siemens. She holds a bachelor’s degree in mathematics and computer science from Stockholm University.
Related Posts
– Interview with Mike Stonebraker. by Roberto V. Zicari on May 2, 2012
##
I’m one of the VoltDB devs and would like to correct a few things. First, VoltDB is 100% ACID across a cluster. There is no non-ACID mode in VoltDB and the only consistency level supported is global serializable consistency.
As for scaling ACID across machines, whether it’s possible depends on the workload. Partitionable workloads scale perfectly. Non-partitionable workloads don’t. Workloads that are somewhere in the middle can still benefit from multiple machines. The vast majority of OLTP use cases we see are either partitionable (by customer, ticker symbol, ad network, etc.) or mostly partitionable.
Finally, building for a cluster allows for a level of availability that can’t be achieved with “scale-in”. VoltDB can run millions of TPS across a synchronously replicated, redundant cluster.
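To illustrate what “partitionable” means in this comment, here is a small Python sketch. It assumes data is partitioned by an integer customer ID using a simple modulo rule; the functions `partition_of` and `classify` are hypothetical and are not part of VoltDB’s API. A transaction whose customers all live in one partition can run there without cross-partition coordination, while one that spans partitions cannot.

```python
def partition_of(customer_id, partitions):
    """Map a customer to a partition (assumed rule: simple modulo)."""
    return customer_id % partitions


def classify(transaction_customers, partitions):
    """Report whether a transaction touches one partition or several."""
    touched = {partition_of(c, partitions) for c in transaction_customers}
    return "single-partition" if len(touched) == 1 else "multi-partition"
```

With four partitions, an order placed by customer 7 is single-partition, whereas a transfer between customers 7 and 12 is multi-partition because the two customers land in different partitions. A workload made up mostly of the first kind is what the comment calls partitionable.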
Thank you John. RVZ
If I remember correctly, VoltDB requires the database to be partitioned per CPU core for performance reasons. Starcounter processes concurrent transactions against a single database and transparently utilizes all cores.
John, can you clarify your statement? Does VoltDB maintain consistency over all cluster nodes and execute simultaneous read and update transactions at a normal operational rate against the same data on all the nodes with the claimed performance? What do you mean by a cluster? Is it shared nothing, shared disk, shared memory, or something else?
I am a Starcounter developer.
Hi Rusian,
Yes, VoltDB maintains consistency across all cluster nodes. Yes, we mix reads and writes at that rate, but with serializable consistency. Our clusters are all shared nothing.
Yes, VoltDB asks users to tell the system how to split up their data across CPUs and cluster nodes. How well your data partitions will affect performance; some workloads that partition poorly may be better suited to scale-up systems or relaxed consistency. For most workloads, VoltDB achieves dramatic price/operation reductions and offers a clear scalability path for growing workloads.
Ruslan,
Apologies for spelling your name wrong above.
Hi John,
Thank you for the clarification. Does VoltDB’s performance of millions of TPS correspond to a well-partitioned database?
Starcounter does not require databases to be partitioned. In our experience there are many applications where the database cannot be partitioned, or where partitioning is difficult, and such applications still require good performance. Thus Starcounter is designed to get the best performance by running on a single machine with a multi-core CPU and a large amount of memory.