On in-memory, key-value data stores. Ofer Bengal and Yiftach Shoolman
“While modernizing legacy applications used to be a key reason for deploying in-memory, key-value data stores, we see that this is changing. New applications particularly those that are highly interactive need to bring a user experience that is very responsive under all conditions. For such new applications, an in-memory datastore, particularly one that can simplify run time analytics like counting, scoring, managing lists and sets, is becoming a key ingredient for low latency responses and high throughput.” –Ofer Bengal.
I have interviewed Ofer Bengal, Co-Founder and CEO of Redis Labs, and Yiftach Shoolman, Co-Founder and CTO of Redis Labs.
Main topics of the interview are: How is the database market evolving, proprietary vs. open source software, in-memory/ key-value data stores, and the new features of Redis.
Q1. How do you see the database market evolving?
Ofer Bengal, Yiftach Shoolman: The main trends we identify today and believe will continue in upcoming years are:
1) Non-relational databases will continue to see growing adoption, because the schema framework is ineffective when it comes to unstructured data, change in data patterns, growing data volumes, more stringent performance requirements and the way modern apps are built.
2) Multiple database models as opposed to the absolute dominance of RDMS in the past few decades, each model solving the requirements of certain use cases.
Moreover, certain modern databases can run several database models (document, graph, etc.)
3) Multiple databases (different types or the same type) serving the same app. Modern applications are based on micro service architecture, in which each micro service works with the best database for its use case.
This creates new challenges for modern databases: (a) Instant provisioning – sometime hundreds or thousands of databases are provisioned within a second, and (b) Multi-tenancy, otherwise the cost associated with managing database infrastructure becomes extremely high.
4) Database-as-a-service is growing vs. self deployed and operated databases. With enterprises gradually moving to the cloud and having to deal with multiple type databases, it makes a lot of sense to outsource deployment and ongoing operations rather than building in-house practice of DBAs and Devops.
5) Hybrid transactional and analytical processing (HTAP). Driven by the need for application analytics to drive business decision making in real time, certain modern databases can handle those two different workloads simultaneously, eliminating the need for exporting transactional data to a separate dedicated analytical database.
Q2. Proprietary vs. open source software: what are the pros and cons?
Ofer Bengal, Yiftach Shoolman: From the community perspective, open source is great. If there is a vibrant community, it pushes innovation, problem solving and compatibility issues with different environments.
From users perspective, open source is “open”, accessible, can be used by anyone, transparent, and free of charge.
It often comes with less of a danger of vendor lock-in. It is very suitable for independent developers and startups. However enterprises using open source products may have certain challenges:
1. The product is not always suitable for enterprise workloads, especially when it comes to databases. Capabilities like infinite seamless scaling, high-availability with instant failover and stable performance at scale are not always the open source developer’s top priority.
2. Commercial support must be obtained and this typically comes with a price tag which is not much different than acquiring a commercial database product.
3. Commercial support is typically provided by a single company (most probably founded by the open source creators), which creates “vendor lock-in” by itself.
4. In the case of databases, using database-as-a-service may turn out to be lower in cost compared to provisioning cloud instances and running zero cost open source software on them, because commercial can be based on efficient multi-tenant architecture.
Q3. What is the current market for in-memory, key-value data stores?
Ofer Bengal: In-memory key-value data stores (sometimes called in-memory data grids (IMDGs)) have been around since more than a decade and have proven capable of supporting digital business needs for responsive, always-on user experience; real-time, actionable insights; and dynamic scaling. They are widely employed when you want to scale/modernize legacy applications without spending additional money on extremely expensive RDBMS licenses and hardware.This is achieved by providing a scalable and reliable in-memory datastore that enables low-latency transactional and analytical processing.
While modernizing legacy applications used to be a key reason for deploying in-memory, key-value data stores, we see that this is changing. New applications particularly those that are highly interactive need to bring a user experience that is very responsive under all conditions. For such new applications, an in-memory datastore, particularly one that can simplify run time analytics like counting, scoring, managing lists and sets, is becoming a key ingredient for low latency responses and high throughput.
From a Redis perspective, our innovation in data structures brings about the ability to simplify development to the extent that now most Redis users use it as a first responder and primary datastore for substantial pieces of their data. Furthermore with Redis’ data-structures, users can run operational and analytical use cases on the same database.
In addition, acceleration of other in-memory platforms like Spark is possible with Redis.
Gartner estimates that, in 2015, the stand-alone IMDG market was worth approximately $600 million, having grown by about 30% from the previous year. Gartner expects the market to continue to grow in the double-digit range through 2020 and to exceed $1 billion by 2018. Redis, one of the leaders in this space, grew in just a few years to be one of the most popular databases used by developers and enterprises.
Q4. Amazon ElastiCache supports two open-source in-memory engines: Redis and Memcached. What does it mean in practice?
Yiftach Shoolman: In practice, Amazon ElastiCache is a simple caching service that simplifies a developer experience by providing these two open source in-memory engines. Legacy applications that use simple cache can use ElastiCache seamlessly.
However, ElastiCache is single-tenant, limited to caching use cases and cannot be used as a database, lacking enterprise-grade functionalities such as infinite seamless scalability, instant failover and predictable performance.
The Redis Labs equivalent service, called Redis Cloud provides all the benefits of an enterprise-class Redis.
Q5. What are the pros and cons of Memcached and Redis?
Yiftach Shoolman: Redis can be thought of as modern database while memcached is older technology designed specifically for ephemeral caching.
The most important difference is in persistence and HA – memcached is not persistent nor HA, while Redis can operate as a full-fledged in-memory database, highly available through both in-memory replication and data persistence. This reflects the fact that caches in older architectures were not required to be highly available, but in modern architectures, built for scale and volume, cache outages can significantly impact the business and user experience.
Redis, the newer and more versatile technology allows individual data elements to be manipulated while memcached often incurs serialization/deserialization overheads that makes the entire application processing much slower. This is because Memcached can handle only simple key value use cases, whereas Redis offers many more data structures (hashes, sets, sorted sets, lists, hyperloglog..) that simplify complex data processing, analysis and operational use cases with ease.
Even when used as a cache, Redis has more sophisticated eviction policies which can be both active or passive while memcached has only a simple LRU and lazy eviction.
Redis and Memcached are both very popular open source projects, but given its richer functionality, more advanced design, many potential uses, and greater cost efficiency at scale, Redis should be your first choice in nearly every case.
Q6. For very large data sets or analytics workloads, running everything in-memory might not be cost effective. What is your take on this?
Ofer Bengal, Yiftach Shoolman: For very large data sets or analytics workloads, it is advantageous to utilize alternative memory technologies(such as Flash memory, which is a tenth of the cost), as extensions of memory rather than impose a disk access penalty. We have extended enterprise Redis in this manner to take advantage of Flash memory, while using a tiered approach (keys and hot values are still in the fastest memory, while cold values are in “slower” Flash memory) to ensure that you still see sub-millisecond latencies with millions of ops/sec throughput.
Q7. Redis was created by Salvatore Sanfilippo in 2009. What is his role today?
Ofer Bengal: Salvatore is leading the development of open source Redis within Redis Labs. He works with a group of experienced developers on extending the capabilities of Redis. A good example of this collaborative works is the recent introduction of Redis Modules, which extend Redis to a variety of new modern use cases. Salvatore wrote the API and the other team members in a very short time created and tested a few modules, such as Redisearch (a full-text search engine) and Redis-ML (enhancing the performance of Spark machine learning capabilities). Salvatore’s role is to continue the community innovation around the Redis core, together with his team of Redis Labs developers.
Q8. What are the differences of Redis Labs` version of Redis with the original one developed in 2009?
Yiftach Shoolman: Redis Labs fully supports the open source Redis versions, but enhances them with a container-like layer that adds a proxy, cluster management and a shared nothing architecture. Taken together, Redis Labs provides a solid enterprise foundation to Redis, allowing it to scale seamlessly in memory across many hundreds of servers with the high availability through persistence, in-memory cross-rack/zone/region/datacenter replication and instant automatic failover. No retooling or re-architecting is required to move from open source Redis to enterprise Redis, the process is basically effortless and immediate. Redis Labs also offers various database modules, like a RediSearch, multiple probabilistic modules like Bloom Filter, TopK, CMS, Redis-ML for Machine Learning, Redis-TS for Time Series processing, JSON and Graph support.
Q9. What are the possible scenarios of using Redis for data analytics?
Ofer Bengal, Yiftach Shoolman: Redis data structures come with built-in simple analytic operations like counting, ranking, scoring, ranges and more. Over time, probabilistic data structures have added the ability to analytically estimate millions and trillions of events, without requiring memory to store all of the events.
Set operations have made it possible to simplify comparisons, intersections, unions of sets – analytics that are usually complicated with data stores. RQL (Redis SQL) and secondary indexing, allows executing complex SQL queries on an existing Redis database. And finally recent modules like RediSearch, Neural Redis and Redis-ML have added advanced search and machine learning capabilities – not naturally occurring in any other databases.
With all of these possibilities, and with the move to automated decision making, we see increasing usage of Redis for data analytics scenarios.
Q10. How safe is a Redis server?
Yiftach Shoolman: The Redis enterprise server comes with client-based SSL authentication, built-in cloud firewall support (when running on public clouds), password authentication and role-based authorization that enables customizing security levels.
Qx. Anything else you wish to add?
Ofer Bengal: Redis is a game -changer when it comes to databases, and its progression over the last seven years has demonstrated that the industry and market are demanding performance and increasing flexibility to deal with all types of data processing, storage and analytic scenarios. Redis’ core values have always included high performance, high throughput and very low latencies. With the visionary addition of modules. The community has turned it into an all purpose datastore – suitable for any scenario that needs a database.
Ofer Bengal – Co-Founder and CEO of Redis Labs
Ofer is a serial entrepreneur who has founded and led several companies in the areas of data communications, telecommunications, Internet, homeland security and medical devices. Ofer was founder & CEO of RIT Technologies (NASDAQ: RITT), a provider of sophisticated telecommunications and data communications systems to major world carriers. He began his career as an aerospace engineer in the Israeli Air Force and then built his own aerospace engineering consulting firm. As a hobby, he has also invented, developed and licensed toy concepts to companies such as Milton Bradley, Hasbro and Tomy. Ofer holds a Bachelor of Science (cum laude) in aerospace engineering from the Technion, Israel Institute of Technology.
Yiftach Shoolman – Co-Founder and CTO of Redis Labs
Yiftach is an experienced technologist, having held leadership engineering and product roles in diverse fields from application acceleration, cloud computing and software-as-a-service (SaaS), to broadband networks and metro networks. He was the founder, president and CTO of Crescendo Networks (acquired by F5, NASDAQ:FFIV), the vice president of software development at Native Networks (acquired by Alcatel, NASDAQ: ALU) and part of the founding team at ECI Telecom broadband division, where he served as vice president of software engineering. Yiftach holds a Bachelor of Science in Mathematics and Computer Science and has completed studies for Master of Science in Computer Science at Tel-Aviv University.
Follow us on Twitter: @odbmsorg