On the Open Source Valkey Project: One Year Later. Q&A with Madelyn Olson
Q1. In March 2024, you and five other members of the open-source community got together to create Valkey within the Linux Foundation. Valkey was meant to be a community-driven project within the Linux Foundation that is supported by a multitude of organizations. The very first release of Valkey was 7.2.5. Valkey celebrated the 1-Year anniversary of the project earlier this year. What happened in the last year?
It’s been a very exciting year for Valkey! We launched two major releases for the project, Valkey 8.0 and 8.1, which introduced some major performance, stability, and reliability improvements. We also released a second distribution of Valkey, which is the Valkey engine natively packaged with some new advanced features, including JSON and bloom filters as new datatypes, support for vector similarity search, and support for LDAP for enterprise authentication.
Q2. Please tell us more about Valkey 8.1. You recently launched the version last month after having gone through several release candidates earlier in the year.
Valkey 8.1 was a release that primarily contained internal optimizations, with a particular focus on improving performance and efficiency. The headline feature is a complete rewrite of the core data structure that backs both the main key-value datastore as well as many of the popular datatypes, including the hash, set, and sorted set. This new implementation is up to 20% more memory efficient while also being slightly more performant. In addition, we have a large number of other improvements such as faster TLS negotiations, faster replication, and improvements to many individual commands such as BITOP. We believe this release has a little something for everyone.
Q3. What are the steps to go from an RC to a GA release? When did you know it was ready to release a GA version?
In this case we mostly worked backwards from when we wanted to launch this release, which was around KubeCon. We wanted about two months of time between when we launched the first release candidate and the GA. We use two months as an estimate for how long it takes us to run additional tests to harden the engine as well as give the community time to try out the system and give us feedback. Valkey is often used in the most demanding part of our users’ applications, so we want to do as much as possible to find potential bugs before the release. That obviously includes writing automated tests, but sometimes you only find a bug when a user tries something we didn’t expect.
Q4. Did Valkey and Redis diverge? If yes how?
One of our goals with the Valkey project is to keep the APIs compatible with the last permissively licensed version of Redis, which is 7.2. This is the foundation that we believe most of our user applications are built on top of, and we don’t intend to change those APIs. We are adding new APIs, though! Most of our new functionality has focused on improved observability, such as new data distribution statistics and a log to help identify commands that are using a lot of network bandwidth.
Q5. Is Valkey still an in-memory NoSQL data store? Or else?
I think of Valkey as a memory-first NoSQL database. We do use disks for durability in certain circumstances today, but we primarily store data in-memory. Over time I would like to see us find more ways to use SSDs to enhance the cost efficiency of user workloads. We’re still in the early days of designing that functionality though.
Q6. You recently mentioned that Valkey 8.0 is faster thanks to incorporating enhanced multithreading and scalability features. What does it mean in practice?
When we forked the codebase, it was using a simpler threaded architecture for executing commands. It was very simple to reason about, but it didn’t scale well to more threads or higher throughput. We built a new architecture that allows the main thread to act as a coordinator and offload work to other threads. In 8.0 we added the ability to offload I/O work, so that the main thread can stay busy processing commands while other threads are responsible for handling network I/O. This strategy keeps the behavior more consistent with earlier versions of Valkey.
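The coordinator-plus-I/O-threads idea described above can be sketched in a few lines. This is a hypothetical, simplified Python illustration, not Valkey's actual implementation (which is in C): I/O threads read and parse "network" input while a single main thread executes every command, so command semantics stay single-threaded.

```python
# Minimal sketch of a main-thread-as-coordinator design. All names and the
# in-memory workload are invented for illustration.
import threading
import queue

request_q = queue.Queue()   # parsed requests, filled by I/O threads
response_q = queue.Queue()  # results, drained by I/O threads

def io_thread(raw_requests):
    # An I/O thread does the "network" work: reading and parsing requests,
    # then handing them to the main thread.
    for raw in raw_requests:
        request_q.put(raw.strip().split())
    request_q.put(None)  # sentinel: this connection is done

def main_thread(store, n_connections):
    # The main thread stays busy executing commands serially, one at a time,
    # so data-structure access needs no locking.
    done = 0
    while done < n_connections:
        req = request_q.get()
        if req is None:
            done += 1
            continue
        cmd, key, *rest = req
        if cmd == "SET":
            store[key] = rest[0]
            response_q.put("OK")
        elif cmd == "GET":
            response_q.put(store.get(key))

store = {}
t = threading.Thread(target=io_thread, args=(["SET k v", "GET k"],))
t.start()
main_thread(store, n_connections=1)
t.join()
results = [response_q.get(), response_q.get()]
print(results)
```

In the real engine the offloaded work is socket reads, protocol parsing, and writes of responses; the sketch only captures the division of labor, not the event loop or protocol.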
Q7. In 8.1, you rebuilt the key-value store from scratch to take better advantage of hash tables also called “Swiss tables”. Why? What are these “Swiss tables” and what benefits do they bring to Valkey?
The Swiss table was designed at Google, and it was built to be flexible enough to solve any use case, just like a “Swiss army knife”. The main idea behind the design is that getting data from DRAM is the bottleneck for most typical Valkey commands, and so we want to limit the number of times we are waiting on data from DRAM. CPUs already try to optimize this by keeping some recently used memory in caches inside the CPU. The goal of a Swiss table is to take advantage of how CPUs store data in the CPU caches to both save memory and improve CPU efficiency. We took inspiration from the Swiss table to rewrite the main key-value datastore within Valkey, achieving up to 20% memory reduction and improving performance by up to 10%. You can read about it in more detail in our writeup.
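The cache-friendliness described above comes from storing one small "tag" byte per slot in a contiguous array, so a whole group of slots can be filtered with a single cache-line read before any full key is touched. Here is a deliberately simplified, hypothetical Python toy of that idea (the real design uses SIMD over control-byte groups and resizing, neither of which is shown):

```python
# Toy illustration of the Swiss-table idea: a 7-bit tag per slot lets a
# lookup reject most slots without a second memory access to the key itself.
GROUP = 8      # slots per group (real implementations use 16)
EMPTY = 0xFF   # sentinel tag; real tags are always 7-bit (0..127)

class SwissLikeTable:
    def __init__(self, n_groups=4):
        self.n_groups = n_groups
        # One tag byte per slot, stored contiguously: scanning a group's
        # tags touches one small region of memory.
        self.tags = bytearray([EMPTY] * n_groups * GROUP)
        self.slots = [None] * n_groups * GROUP  # (key, value) pairs

    def _split(self, key):
        h = hash(key) & 0xFFFFFFFF
        return (h // 128) % self.n_groups, h % 128  # (group index, 7-bit tag)

    def put(self, key, value):
        g, tag = self._split(key)
        base = g * GROUP
        for i in range(GROUP):
            t = self.tags[base + i]
            if t == tag and self.slots[base + i][0] == key:
                self.slots[base + i] = (key, value)  # overwrite existing key
                return
            if t == EMPTY:
                self.tags[base + i] = tag
                self.slots[base + i] = (key, value)
                return
        raise RuntimeError("group full; a real table would resize here")

    def get(self, key):
        g, tag = self._split(key)
        base = g * GROUP
        for i in range(GROUP):
            # Cheap tag comparison first; only touch the full key (a second
            # memory access) when the tag matches.
            if self.tags[base + i] == tag and self.slots[base + i][0] == key:
                return self.slots[base + i][1]
        return None

table = SwissLikeTable()
table.put("user:1", "alice")
table.put("user:2", "bob")
print(table.get("user:1"))
```

The memory saving comes from the same trick: a one-byte tag per slot replaces heavier per-entry bookkeeping, and most probes resolve against tags that are already in a CPU cache line.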
Q8. Is Valkey a high-performance database for a variety of distributed workloads? Do you have any benchmark results to share?
We talk quite a bit about the 1 million requests per second (RPS) achievable within a single node; we documented the results in this blog. It requires the I/O threading I talked about earlier, using about 8 cores in total to achieve that performance. In addition, Valkey supports horizontal scaling, so you can scale up to about 500 shards safely today, which gives you something like 500 million RPS in a single cluster. We’re actively working on benchmarking workloads with clusters of that size, to see if we can push that number higher.
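The cluster figure above is simply per-node throughput multiplied across shards (the numbers come from the interview, not from a new measurement):

```python
# Back-of-the-envelope check of the cluster throughput claim.
per_node_rps = 1_000_000   # single-node result with I/O threading, ~8 cores
shards = 500               # safe cluster size cited above
cluster_rps = per_node_rps * shards
print(f"{cluster_rps:,} RPS")  # 500,000,000 RPS
```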
Q9. What kind of efforts did you put into the project to ensure a smooth transition for existing Redis Users?
If a Redis user is on 7.2 or earlier, the migration story is very simple, since we support loading the replication stream and commands from those older versions. We’ve heard a large number of stories about how it was painless for them to move.
Q10. Is Amazon Web Services using Valkey? For what specifically?
Amazon officially supports Valkey through its two managed services, Amazon ElastiCache for Valkey and Amazon MemoryDB for Valkey. Several internal Amazon services and teams use Valkey through these offerings.
Q11. How can developers join the Valkey community and what should they contribute?
The easiest way to connect is to visit our GitHub and try out Valkey yourself. We love to hear feedback from all sorts of users. If you are interested in developing the engine itself, I would recommend reaching out to the folks on our community Slack, who can help you find some easy tasks to onboard with and learn the codebase. You can find links to all of that here: https://valkey.io/connect/.
Q12. What is up next for the Valkey community?
We are now scoping and developing our next major release, Valkey 9.0, which will add a host of new functionality to the engine. The two main user-facing features are support for setting a time to live (TTL) on specific fields within a hash data type, currently the most requested Valkey feature, as well as support for multiple logical databases in Valkey’s distributed mode (Valkey Cluster). We are also working on a new algorithm for rebalancing clusters that is faster, more reliable, and much easier to use. You can also expect us to continue innovating in the performance space. The release will be ready sometime this fall.
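To make the per-field TTL feature concrete, here is a hypothetical Python sketch of the semantics: each hash field carries its own optional expiry and is lazily removed on access. The class and method names are invented for illustration; this is not Valkey's API or implementation.

```python
# Sketch of a hash whose individual fields can expire independently,
# while fields without a TTL persist like normal hash fields.
import time

class HashWithFieldTTL:
    def __init__(self):
        self._fields = {}  # field -> (value, expires_at or None)

    def hset(self, field, value, ttl_seconds=None):
        expires = time.monotonic() + ttl_seconds if ttl_seconds else None
        self._fields[field] = (value, expires)

    def hget(self, field, now=None):
        # `now` is injectable for testing; defaults to the current clock.
        now = time.monotonic() if now is None else now
        entry = self._fields.get(field)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and now >= expires:
            del self._fields[field]  # lazily expire the field on access
            return None
        return value

h = HashWithFieldTTL()
h.hset("session_token", "abc123", ttl_seconds=30)  # expires on its own
h.hset("user_name", "madelyn")                     # no TTL: persists
print(h.hget("session_token"))
```

Today a TTL can only be set on an entire key, so expiring one field of a hash requires client-side workarounds; per-field TTL moves that bookkeeping into the engine.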
………………………………………………..

Madelyn Olson, AWS Engineer and Valkey project maintainer.
Software engineer interested in building complex scalable technologies and solving hard problems for customers. I’m also passionate about helping build and maintain open communities around open-source projects.
Resources
On the Open Source Valkey Project. Q&A with Madelyn Olson, ODBMS.ORG JUNE 5, 2024
Sponsored by AWS