On Amazon MemoryDB. Q&A with Jonathan Fritz

Q1. What are the key technical challenges when using microservices for a software architecture composed of many small independent services that communicate with each other?

Developers building applications that require high performance at massive scale use microservices, splitting application functionality into separate, independent services to make the applications easier to deploy, manage, and scale. However, applications built using microservices also increase the complexity of the underlying code base. Microservices applications demand extremely low latency and high throughput because they often involve hundreds of microservices per user interaction or API call. Microservices also need low-latency, high-throughput data stores to act as queues and state storage between services. To simplify the way they build these applications, developers look for databases with easy-to-use data models that enable the access patterns their services need, and with flexible data structures rather than more limited key-value storage.

Q2. Database performance is critical to the success of interactive applications. To reduce read latency to microseconds, you can put an in-memory cache in front of a durable database. What are the limitations of such a solution?

Although caching can speed up access and reduce load on a primary database, there are some complexities in this architecture. First, you must manage two different systems (a primary database and a cache). This adds complexity in scaling and tuning each system independently and managing two different engines. Second, your application code must deal with caching patterns like cache invalidation, which add complexity to your code. Finally, you face the added expense of running both a database and a cache. Yet if you don't size your cache to store 100% of your data, you hit significant performance bottlenecks whenever requests miss the cache and fall through to the database.
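The cache-aside pattern behind this architecture can be sketched in a few lines. This is a minimal illustration in plain Python, with a dict-backed `Store` standing in for both the cache and the primary database; in a real deployment these would be, for example, a Redis client and a database driver. The extra invalidation logic in `write()` is exactly the application-level complexity described above.

```python
class Store:
    """Dict-backed stand-in for a cache or a primary database."""
    def __init__(self):
        self.data = {}
        self.reads = 0  # count lookups so we can see which tier gets hit

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


def read(cache, db, key):
    # Cache-aside read: try the cache first, fall back to the database,
    # then populate the cache for subsequent reads.
    value = cache.get(key)
    if value is None:
        value = db.get(key)  # slower path: a cache miss hits the database
        if value is not None:
            cache.set(key, value)
    return value


def write(cache, db, key, value):
    # Every write must also invalidate (or update) the cached copy --
    # this is the invalidation logic the application has to carry.
    db.set(key, value)
    cache.delete(key)
```

A single in-memory primary database removes both the second system and this invalidation code path.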

Q3. You just announced Amazon MemoryDB for Redis. What is it?

Amazon MemoryDB for Redis is a fully managed, Redis-compatible, in-memory database that provides low latency, high throughput, and durability at virtually any scale. MemoryDB is purpose-built to make it easier to build applications that require a durable database, microsecond read latency, and single-digit millisecond write latency. Plus, it stores entire datasets in-memory and can be used as a single, primary database. This means you can build high-performance applications with microservices without having to separately manage a cache, durable database, or the required underlying infrastructure. Compatible with the popular Redis API, MemoryDB enables developers to quickly build applications using the same familiar Redis application code, data structures, and commands they use today. Data in MemoryDB is always encrypted and stored with high durability in a transactional log across multiple Availability Zones (Multi-AZ) to enable fast database recovery and restart. 

Q4. What is the difference between MemoryDB and Amazon ElastiCache for Redis? 

MemoryDB for Redis is a durable, in-memory database. Consider using MemoryDB if your workload requires a durable database that provides ultra-fast performance (microsecond read and single-digit millisecond write latency). If you’re building an application using Redis data structures and APIs and need a primary, durable database, MemoryDB may be a good fit for your use case. Finally, you should consider using MemoryDB to simplify your application architecture and lower costs by replacing a separate database and cache with a single service that provides both durability and performance.

ElastiCache for Redis is a caching service that is commonly used to cache data from other databases and data stores. Consider ElastiCache for Redis for caching workloads where you want to accelerate data access with your existing primary database or data store (microsecond read and write performance). Also consider ElastiCache for Redis for use cases where you want to use the Redis data structures and APIs, but don’t require a durable database. 

Q5. What does it mean to be “Redis compatible”?

When we say “Redis-compatible,” we mean that MemoryDB supports all Redis data structures and essentially all Redis commands. The only commands we do not support are Redis admin commands because MemoryDB already automates those actions to take the burden of management away from customers. Customers who are running an application with Redis can easily migrate it over to MemoryDB with no code changes because of the Redis API compatibility.

Q6.  Why not use open source Redis directly instead? 

Open source Redis does not have the durability and consistency features offered by MemoryDB, which many customers require for their primary databases. Open source Redis includes an optional append-only file (AOF) feature, which persists data in a file on a primary node’s disk for durability. However, because AOF stores data locally on primary nodes in a single Availability Zone (AZ), there are risks of data loss. Also, in the event of a node failure, there are risks of consistency issues with replicas. Open source Redis allows writes and strongly consistent reads on the primary node of each shard and eventually consistent reads from read replicas. These consistency properties are not guaranteed if a primary node fails, as writes can become lost during a failover and thus violate the consistency model.

MemoryDB leverages a distributed transactional log to durably store data across multiple AZs. By storing data across multiple AZs, MemoryDB has fast database recovery and restart. Also, MemoryDB offers eventual consistency for replica nodes and consistent reads on primary nodes. The consistency model of MemoryDB is similar to open source Redis. However, in MemoryDB, data is not lost across failovers, allowing clients to read their writes from primaries regardless of node failures. Only data that is successfully persisted in the Multi-AZ transaction log is visible. Replica nodes are still eventually consistent, with lag metrics published to Amazon CloudWatch.

MemoryDB’s database design enables it to store data durably with fast database recovery and restart, allowing customers to use it as a primary, in-memory database.

Q7. Could you dive into the benefits of MemoryDB?

MemoryDB provides low latency, high throughput, and durability at virtually any scale. 

First, MemoryDB offers microsecond read and single-digit millisecond write latencies and supports millions of transactions per second.

Second, it offers flexible and friendly Redis data structures and APIs and has full compatibility with Redis commands (except for a few admin commands). Redis has been voted the “Most Loved” database by Stack Overflow developers for 5 consecutive years. Developers love Redis because its unique API makes building applications faster and easier.

Third, MemoryDB provides Multi-AZ data durability using an underlying durable transaction log component. You can create highly available clusters with up to five read replicas across different Availability Zones.

Fourth, MemoryDB is fully managed and handles software configuration, monitoring, snapshots, and upgrades for you.

Fifth, MemoryDB is highly scalable. You can scale a cluster to up to 500 nodes, storing over 100 TB of data (this is assuming 250 shards with one replica). You can scale for higher write throughput by adding new shards, or scale for read throughput by adding new replicas per shard. You can also scale vertically by choosing larger or smaller node types for your cluster.
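The 500-node figure follows directly from the shard layout: each shard contributes one primary node plus its read replicas. A quick sketch of that arithmetic:

```python
def node_count(num_shards: int, replicas_per_shard: int) -> int:
    # Each shard has one primary node plus its read replicas.
    return num_shards * (1 + replicas_per_shard)
```

So 250 shards with one replica each gives 250 × 2 = 500 nodes.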

Sixth, MemoryDB offers enterprise-grade security, including encryption at-rest with AWS Key Management Service (KMS) keys, encryption in-transit with TLS, Graviton2 in-memory encryption, support for Amazon VPC, and fine-grained user authentication and authorization using Redis Access Control Lists (ACL).

Lastly, MemoryDB supports AWS Graviton2 instance types, providing better performance at a lower cost.

Q8. Can you illustrate some use cases?

Consider MemoryDB when you need extremely low latency (microsecond read, single-digit millisecond write), durability, and high throughput. In retail, it could be serving customer profiles and accounts, or inventory tracking and fulfillment. In gaming, it might be serving leaderboards, player data stores, or session history. Additionally, the Redis engine provides the ability to serve information beyond key-value lookups. Flexible data structures can be used to consume and aggregate real-time events for streaming and analytical use cases. For instance, you can ingest data from event-driven sources like clickstream, IoT, and mobile applications using Redis streams. Then you can leverage in-memory Redis data structures like sorted sets and hashes to model and maintain in-memory aggregations. You can also scale and coordinate the processing of your events using Redis consumer groups to provide real-time insights.
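The ingest-and-aggregate flow described above can be sketched as follows. This is a stdlib-only stand-in for illustration: the list models a Redis stream and the dict models a sorted set. Against MemoryDB you would instead issue the Redis commands noted in the comments (XADD to append events, ZINCRBY to update the running tally, ZREVRANGE to read the leaders).

```python
from collections import defaultdict

stream = []                   # stands in for a Redis stream (XADD appends here)
scores = defaultdict(float)   # stands in for a sorted set (ZINCRBY updates it)

def ingest(event):
    # XADD clickstream * user <user> value <value>
    stream.append(event)
    # ZINCRBY totals <value> <user>  -- maintain the aggregation on ingest
    scores[event["user"]] += event["value"]

def top(n):
    # ZREVRANGE totals 0 n-1 WITHSCORES  -- read the current leaders
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

With consumer groups, several workers could share the `ingest` step across a single stream, each claiming a disjoint subset of entries.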

Q9. How can you achieve microsecond read latency, single-digit millisecond write latency, and Multi-AZ durability for applications with microservices architectures?

You can achieve this by using MemoryDB as your database for your microservices applications. MemoryDB provides microsecond read and single-digit millisecond write latencies with high throughput. You can use MemoryDB as the hot data tier for your microservices applications. Also, Redis provides APIs like Pub/Sub to make it easy to build queues for passing data between microservices.
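Redis Pub/Sub delivers each published message to every current subscriber of a channel, which is what makes it useful for passing data between microservices. Here is a minimal stdlib sketch of those semantics; with a Redis client library you would call the equivalent of `PUBLISH` and `SUBSCRIBE` against the MemoryDB endpoint instead.

```python
from collections import defaultdict

subscribers = defaultdict(list)   # channel name -> list of handler callbacks

def subscribe(channel, handler):
    # SUBSCRIBE: register interest in a channel
    subscribers[channel].append(handler)

def publish(channel, message):
    # PUBLISH: fan the message out to every current subscriber;
    # Redis PUBLISH returns the number of receivers, mirrored here.
    for handler in subscribers[channel]:
        handler(message)
    return len(subscribers[channel])
```

One microservice publishes to a channel such as "orders" while others subscribe, decoupling the producer from its consumers.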

Q10. How do you create a MemoryDB cluster?

You can easily and quickly create a MemoryDB cluster with a few clicks from the AWS Console, or using the AWS Command Line Interface (CLI) or AWS Software Development Kit (SDK). 
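With the AWS SDK for Python (boto3), cluster creation comes down to a single CreateCluster call. The sketch below only builds the request parameters (the cluster name, node type, and ACL name are placeholder values); the commented-out lines show where the actual SDK call would go, so nothing here requires AWS credentials.

```python
def create_cluster_params(name, node_type, acl_name, shards=1, replicas=1):
    # Parameter names follow the MemoryDB CreateCluster API.
    return {
        "ClusterName": name,
        "NodeType": node_type,
        "ACLName": acl_name,          # Redis ACL controlling user access
        "NumShards": shards,
        "NumReplicasPerShard": replicas,
        "TLSEnabled": True,           # MemoryDB encrypts data in transit
    }

# Placeholder names; substitute your own cluster, node type, and ACL.
params = create_cluster_params("my-cluster", "db.r6g.large", "my-acl")
# import boto3
# boto3.client("memorydb").create_cluster(**params)
```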

Q11. When is it appropriate to scale a MemoryDB cluster horizontally (by adding or removing nodes), and when vertically (by moving to larger or smaller node types)?

You can easily scale your MemoryDB cluster with a few clicks in the AWS Management Console, or using the AWS CLI or AWS SDK. Consider scaling horizontally when you need more write or read throughput. For write throughput, you can add more shards, and for read throughput, you can add more read replicas per shard. Consider scaling vertically when you require more memory and compute per node. You can change the node type used in your cluster to a larger or smaller type.

Q12. MemoryDB supports write scaling with sharding and read scaling by adding replicas. What does it mean in practice?

In practice, you can use horizontal scaling for write and read performance to scale your cluster as your workload requirements change. For write performance, you can add a shard and spread out your Redis keyspace over more primary nodes, giving more resources serving writes. For reads, you can add read replicas to a shard to scale out the number of nodes serving reads. Each of these scaling actions helps match your cluster resources to workload requirements to avoid overprovisioning, which keeps your costs low as you scale.
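The keyspace spreading described here can be sketched as follows. Redis Cluster divides the keyspace into 16,384 hash slots and assigns contiguous slot ranges to shards; Redis actually hashes keys with CRC16, so the stdlib CRC32 below is only a stand-in for illustration. Adding a shard re-divides the ranges so the same keys map onto more primaries.

```python
import zlib

SLOTS = 16384  # Redis Cluster divides the keyspace into 16,384 hash slots

def slot_for(key: str) -> int:
    # Redis uses CRC16(key) mod 16384; zlib's CRC32 is a stdlib stand-in.
    return zlib.crc32(key.encode()) % SLOTS

def shard_for(key: str, num_shards: int) -> int:
    # Slots are split into contiguous ranges, one range per shard, so
    # adding a shard spreads writes over more primary nodes.
    return slot_for(key) * num_shards // SLOTS
```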

Q13. How do I learn more about Amazon MemoryDB for Redis?

Here are some resources to learn more about MemoryDB:

Jonathan Fritz, Head of Product Management for In-Memory Database and Caching Services, AWS

Jon has served in a variety of leadership positions across database, analytics, and blockchain technologies at AWS since he joined in 2013. Currently, he leads product management for the AWS In-Memory Database and Caching Services, including ElastiCache for Redis, ElastiCache for Memcached, and MemoryDB for Redis. Prior to that, he founded the AWS blockchain organization and launched the Amazon Managed Blockchain service. Jon also served as the Head of Product Management for Amazon EMR, a big data and machine learning service supporting the Apache Spark and Apache Hadoop ecosystems.

Jon holds an MBA from Stanford Graduate School of Business and a BA in chemistry with a minor in biology from Washington University in St. Louis.

Sponsored by AWS.
