On Apache Cassandra 4.0. Q&A with Patrick McFadin


Q1. Apache Cassandra 4.0 has now entered general availability (GA) and is considered production-ready. Why did it take 13 months since the beta was announced?

This was a direct result of the early commitment agreed to by the project's contributors: when we finally shipped 4.0, it would be ready on day one and already used in production by the teams contributing to and certifying the release. As the beta was tested in live production environments, some of the most important bugs were found and fixed. Sometimes it was frustrating to watch as we got closer to the final release candidate, only to find a bug at the last minute, but the project stuck to the principles of what we felt a certified release should look like.

Q2. What is special about this new release?

I jokingly say that the most important thing about this release is how much more boring Cassandra is now. Cassandra is over 10 years old and has left its adolescent past behind to become the mature database it needs to be. There are countless mission-critical workloads running on Cassandra now, and there really isn't anything more important in a database than the trust that it will keep your data safe. Many of the new features are aimed at helping operators deploy larger clusters with very little operational overhead. Petabyte-scale clusters are no longer rare, so this is an important milestone for the future of the project.

Q3. The goal for this release was to be “the most stable Apache Cassandra in history.” How many bugs were fixed? 

Over one thousand bugs were fixed, but even more interesting is how they were found. Testing distributed systems is one of the more challenging problems in infrastructure. Unfortunately, up to now, the best testing has been done in production over a long time in service. A few teams decided to focus on better testing frameworks, and the result was remarkable. There are now several new sub-projects around Cassandra that focus on various types of correctness testing and failure injection at scale. The bar was simple: after running every horrible scenario that could be dreamed up, the database can't lose a single byte of data. If it does, it has to be fixed. That should give everyone a lot of confidence when deploying.
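To make that bar concrete, here is a minimal, illustrative sketch of the kind of data-integrity check described, written against the DataStax Java driver. The keyspace, table, and single-node local cluster are assumptions for illustration; the project's actual test frameworks, with failure injection at scale, are far more sophisticated.

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.Row;

public class NoLostBytesCheck {
    public static void main(String[] args) {
        // Connects to a local test node on 127.0.0.1:9042 by default.
        try (CqlSession session = CqlSession.builder().build()) {
            session.execute("CREATE KEYSPACE IF NOT EXISTS test WITH replication = "
                    + "{'class': 'SimpleStrategy', 'replication_factor': 1}");
            session.execute("CREATE TABLE IF NOT EXISTS test.kv (k int PRIMARY KEY, v text)");

            // Write a known data set.
            for (int i = 0; i < 1000; i++) {
                session.execute("INSERT INTO test.kv (k, v) VALUES (" + i + ", 'value-" + i + "')");
            }

            // The real test suites would inject node failures, restarts, and repairs here.

            // Verify that every row survived intact.
            for (int i = 0; i < 1000; i++) {
                Row row = session.execute("SELECT v FROM test.kv WHERE k = " + i).one();
                if (row == null || !("value-" + i).equals(row.getString("v"))) {
                    throw new AssertionError("Data loss detected for key " + i);
                }
            }
            System.out.println("All rows verified intact.");
        }
    }
}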

Q4. One of the new features is change data capture (CDC) streaming. What is it and what is it useful for?

There are several use cases that require data to be durably committed before proceeding to the next step in the application. When Cassandra is used as the database of record, having it emit the recently committed data is incredibly useful. A good example is alerting systems: if an alert is sent earlier in the data-processing pipeline, before the write has been committed, an application reading from that Cassandra table faces a potential race condition. Another use case is sending the committed data to a secondary processing system, such as a search index.
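As a rough sketch of how this might be wired up (not taken from the interview), the example below uses the DataStax Java driver to flag a hypothetical table for CDC. The keyspace and table names are assumptions; the node itself must also have cdc_enabled: true set in cassandra.yaml, and downstream consumers read the flushed commit log segments from the node's cdc_raw directory.

import com.datastax.oss.driver.api.core.CqlSession;

public class EnableCdc {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder().build()) {
            // CDC is a per-table property; this marks writes to the (hypothetical)
            // orders.events table for capture. The node also needs cdc_enabled: true
            // in cassandra.yaml before segments are retained for consumers.
            session.execute("ALTER TABLE orders.events WITH cdc = true");
            // Once commit log segments containing CDC-enabled writes are flushed,
            // they land in the node's cdc_raw directory, where a downstream consumer
            // (an alerting service, a search indexer, etc.) can pick them up.
        }
    }
}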

Q5. With incremental repair, you have hardened consistency checking between replicas. Why? What applications may benefit from this?

The repair process has been overly complicated for too long in Cassandra, and this was a significant push to make it easy and reliable. Anti-entropy repair isn't an optional maintenance task for a running cluster, yet operators have had to either learn its complicated failure modes or, in the worst case, stop running the process altogether. We are hopeful that these changes make the repair process something that "just works" and reduce the operational overhead for end users.

Q6. You have also added real-time audit logging. Why? What applications may benefit from this?

Audit logging has been an important need for large enterprises with tighter security requirements and is mandatory for highly regulated industries like financial services. Along with the theme of database maturity, this was an important feature. The addition of auditing will allow Cassandra to be included in more use cases that need a highly scalable, distributed database. For users that don't need full auditing, the same mechanism creates replay logs that can be used for testing: record your production workload, then replay it in your test environment. It was a nice feature we could add given how auditing taps into the query processor.

Q7. The Apache Cassandra project is planning to commit to 6-month release cycles. What is your take on this? Why so many release cycles?

One of the topics the project has been debating often is release cadence and its downstream effect. Incremental changes beyond bug fixes are important as new features are improved and matured, and a consistent cadence gives operators a way to plan upgrades. Major releases with breaking changes will only come yearly, again to allow operators to plan.

Q8. The community has also made changes to the development process. Can you tell us a bit more about it?

The biggest change has come in how we introduce new features. In the past, a new feature would be started either as an issue in the Apache Jira or even as a pull request against the project. The Cassandra Enhancement Proposal (CEP) has given much more structure to the process by giving new features a place to be discussed and shaped before anyone writes a line of code. Several proposals have already been submitted and are on their way to being solidified into new features. For the larger community, it will allow more voices to be heard as we continue to improve Cassandra into the future.

Q9. What role has DataStax played for this new release?

We were very happy to join with the larger community to contribute time and resources to the release. There were 217 contributors to the 4.0 release, and what great company to be in! DataStax is committed to contributing not only to the engineering effort but also to supporting the community through education and certification.

Q10. What is the roadmap ahead for the Apache Cassandra project?

With a really solid baseline to work with, there is a lot of pent-up demand for some newer and cutting-edge features. Looking at some of the existing CEPs under review, there are some great new things coming. One that has already gained a lot of interest is a replacement for the existing secondary indexing. The current version has very limited functionality and performance, and the new system will greatly increase the flexibility of existing data models. Another is creating more choices in how we store data, which speaks to something we should see more of in future releases: increased interoperability and ecosystem integration with cloud-native technologies such as Kubernetes.

Qx. Anything else you wish to add?

Just a huge thank you to everyone who has contributed to Apache Cassandra over the past 10 years. We have a lot to be proud of, and the future looks even more exciting. Next stop… 5.0!

………………………………………………….

Patrick McFadin is the co-author of the upcoming O’Reilly book “Managing Cloud-Native Data on Kubernetes.” He currently works at DataStax in Developer Relations and as a contributor to the Apache Cassandra project. Patrick has worked as Chief Evangelist for Apache Cassandra and as a consultant for DataStax, where he had a great time building some of the largest deployments in production. Prior to DataStax, he held positions as Chief Architect, Engineering Lead, and DBA/Developer.

Sponsored by DataStax
