On Amazon DocumentDB. Interview with Barry Morris
“We built DocumentDB to implement the Apache 2.0 open source MongoDB APIs, specifically by emulating the responses that a MongoDB client expects from a MongoDB server. We don’t support 100 percent of the APIs today, but we do support the vast majority that customers actually use. We continue to work back from customers and support additional APIs that customers ask for.” — Barry Morris.
I have interviewed Barry Morris, GM ElastiCache, Timestream and DocumentDB at AWS. We talked about DocumentDB
Q1. AWS has many database services now. Why DocumentDB? Why did you build it?
Barry Morris: At AWS we believe customers should choose the right tool for the right job, and we don’t believe there’s a one size fits all database given the variety and scale of applications out there. Customers using our purpose-built databases don’t have to compromise on the functionality, performance, or scale of their workloads because they have a tool that is expressly designed for the purpose at hand. In the case of Amazon DocumentDB (with MongoDB compatibility) we offer a fast, scalable, highly available, and fully managed document database service that is purpose-built to store and query JSON.
We built Amazon DocumentDB because customers kept asking us for a flexible database service that could scale document workloads with ease. Amazon DocumentDB has made it simple for these customers to store, query, and index data in the same flexible JSON format that is generated in their applications, so it is highly intuitive for their developers. And it achieves this expressive document query support while also maintaining the high availability, performance, and durability required for modern enterprise applications in the cloud. Similar to our other AWS purpose-built database services, Amazon DocumentDB is fully managed, so customers can scale their databases with clicks in the console rather than executing a planning exercise that takes weeks.
Finally, because many of our customers with document database needs are already enthusiastic about and familiar with the MongoDB APIs, we designed Amazon DocumentDB to implement the Apache 2.0 open source MongoDB APIs. This allows customers to use their existing MongoDB drivers and tools with Amazon DocumentDB, and to migrate directly from their self-managed MongoDB databases to Amazon DocumentDB. It also gives them the freedom to migrate data in and out of DocumentDB without fear of lock-in.
Q2. Who is using DocumentDB and for what?
Barry Morris: Amazon DocumentDB is being used today by a wide variety of customers, from longstanding global enterprises like Samsung and Capital One, to digital natives like Rappi and Zulily, to financial organizations like FINRA. In addition, several products that Amazon customers use, such as the Fulfillment by Amazon (FBA) experience on Amazon.com, are powered by Amazon DocumentDB. We have customers in virtually every industry, from financial services to retail, from gaming to manufacturing, from media and entertainment to publishing, and more.
Many of our customers are software engineering teams who don’t want to deal with the “undifferentiated heavy lifting” of database administration, such as hardware provisioning, patching, setup, and configuration. These organizations would rather allocate their valuable engineering talent to building core application functionality, rather than deploying and managing MongoDB clusters. One of our customers, Plume, saved themselves the cost of “three to five approximately $150,000 Silicon Valley salaries” which both offset the managed service cost and allowed their team to focus on their core mission to deliver a superior wireless internet experience. Further, DocumentDB allows Plume to scale much more than their previous solution, with one of their clouds handling as many as 50,000 API calls per minute. You can read the full case study here.
The customer use cases are wide and many, given that document databases offer both flexible schemas and extensive query capabilities. Some of the more traditional use cases for document databases include catalogs, user profiles, and content management systems; and with the scale that AWS and Amazon DocumentDB provide, we are seeing customers deploy document databases for a much wider range of internet-scale use cases, including critical customer-facing e-commerce applications and production telemetry.
Q3. What has been the customer response?
Barry Morris: As with all AWS services, we work very closely with DocumentDB customers to ensure we are building a service that works backward from their needs. To date, the feedback we get is that customers are thrilled by DocumentDB’s ease of scaling, its fully managed capabilities, its natural integration with other AWS offerings, its durability and general enterprise-readiness, and its straightforward API compatibility with MongoDB. Of course, we are always working to add capabilities and features that are highly requested. For example, we just improved our MongoDB compatibility by adding support for frequently requested APIs such as renameCollection, $natural, and $indexOfArray. In the coming months, we also plan to release one of our most-requested features, Global Clusters, for customers with cross-region disaster recovery and data locality requirements. We also continue to bolster our MongoDB compatibility by adding support for the APIs that customers use the most.
Q4. What are the main design features of Amazon DocumentDB?
Barry Morris: Amazon DocumentDB has been built from the ground up with a cloud native architecture designed for scaling JSON workloads with ease. An essential design feature of DocumentDB is that it decouples compute and storage, allowing each to scale independently. Because storage and compute are separate, customers can add replicas without putting additional load on the primary. This allows you to easily scale out read capacity to millions of requests per second by adding up to 15 low latency read replicas across three AWS Availability Zones (AZs) in minutes. DocumentDB’s distributed, fault-tolerant, self-healing storage system auto-scales storage up to 64 TB per database cluster without the need for sharding, and without any impact or downtime to a customer’s application.
As I mentioned before, DocumentDB is built to be enterprise-ready. It provides strict network isolation with Amazon Virtual Private Cloud (VPC). All data is encrypted at rest with AWS Key Management Service (KMS) and encryption in transit is provided with Transport Layer Security (TLS). DocumentDB has compliance readiness with a wide range of industry standards, and automatically and continuously monitors and backs up to Amazon S3, which is highly durable.
Q5. When would you suggest to use DocumentDB vs another purpose-built database?
Barry Morris: At its core, DocumentDB is designed to store, index, and query rich and complex JSON documents with high availability and scalability. You can retrieve documents based on nested field values, join data across collections, and perform aggregation queries. So if you need schema flexibility and the ability to index and query rich structured and semi-structured documents, DocumentDB is a great choice. This is particularly true if you have JSON document workloads that are mission critical for your organization. A DocumentDB cluster provides 99.99% availability, can handle tens of thousands of writes per second and millions of reads per second, and supports up to 64 TiB of data. Finally, since DocumentDB supports MongoDB workloads and is compatible with the MongoDB API, it is a logical choice for MongoDB users who are looking to easily migrate to a fully managed database solution. Every use case is unique, and it is often a good idea to engage an AWS solution architect (SA) if you have questions about selecting the right database for your next application.
Q6. What are the key advantages of DocumentDB vs managing your own cluster?
Barry Morris: For many customers, fully managed is all about scale. We scale your database at the click of a button, saving you nights and weekends of scaling clusters manually. Customers don’t have to worry about provisioning hardware, running the service, configuring for high availability, or dealing with patching and durability. These concerns are shifted to AWS, so our customers can focus on their applications and innovate on behalf of their customers. Something as simple as backup and restore can be a drag on production. With DocumentDB, backup is on by default.
Cost is also a big concern when managing your own clusters. This can include the cost of labor resources, hardware investments, vendor software solutions, support costs, and more. Cost becomes very transparent with DocumentDB, as it offers pay-as-you-go pricing with per second instance billing. You don’t have to worry about planning for future growth, because DocumentDB scales with your business.
Q7. Tell me about “MongoDB compatibility” – what does that really mean in practice?
Barry Morris: That’s a great question and one we get a lot from customers. We built DocumentDB to implement the Apache 2.0 open source MongoDB APIs, specifically by emulating the responses that a MongoDB client expects from a MongoDB server. We don’t support 100 percent of the APIs today, but we do support the vast majority that customers actually use. We continue to work back from customers and support additional APIs that customers ask for. Because we offer MongoDB API compatibility, it’s straightforward to migrate from the MongoDB databases you’re managing on premises or in EC2 today to DocumentDB. Updating the application is as easy as changing the database endpoint to the new Amazon DocumentDB cluster.
Q8. Let’s hear about some exciting customer momentum. Can you please share some customer stories?
Barry Morris: We have a lot of them! Customers including BBC, Capital One, Dow Jones, FINRA, Samsung, and The Washington Post have shared their success stories with us. Recently, we’ve done some deeper-dive case studies with customers in a range of industries.
For example, Zulily presented their solution at AWS re:Invent 2020. The popular online retailer is using Amazon DocumentDB along with Amazon Kinesis Data Analytics to power its “suggested searches” feature. In this solution, Kinesis Data Analytics filters relevant events from clickstream analytics when a Zulily customer requests a search, a Lambda function performs a lookup for brands and categories relevant to those events, and the resulting enriched events — which populate the suggested search — are stored in DocumentDB. The feature has been a hit, with more than 75% of Zulily customers using suggested searches when they search the online store.
A customer story that is particularly compelling given recent events is Rappi. Rappi is a successful Colombian delivery app startup that operates in nine Latin American countries. The company had been rearchitecting their monolithic application into a more flexible, microservices-driven architecture to help it scale as it grew. As part of this modernization effort, the startup selected DocumentDB as a fully managed, purpose-built JSON database service to replace its self-managed MongoDB clusters, which were becoming unwieldy to manage at scale. When Covid-19 hit, the company faced an unprecedented surge in orders and deliveries. DocumentDB enabled them to handle the surge because, as a highly scalable service, it operated as normal despite the change in volume. Overall, Rappi decreased management and operational overhead by more than 50% using Amazon DocumentDB.
A final one I will mention is Asahi Shimbun, which is one of Japan’s oldest and largest-circulated newspapers. The company overhauled its digital app last year using AWS and selected Amazon DocumentDB as their content master database to store their articles. Since modernizing, Asahi Shimbun has seen a 30% reduction in monthly operation costs for extracting past articles and a 20% improvement in frequency of use for the app. This is one of many examples that showcase how essential AWS is for industries like publishing, retail, and banking that are evolving with new business models in the cloud.
You can peruse these and many other customer case studies in full on our website.
Q9. Anything else you wish to add?
Barry Morris: Over the last decade, JSON/document-based workloads have become one of the primary alternatives to relational approaches, for a wide range of applications with requirements for flexible data management. We expect this trend to keep growing, particularly with cloud-native applications, and we’re excited to offer DocumentDB as a tool in the toolkit of modern builders leveraging JSON. It’s been great to see DocumentDB support the needs not only of customers who are migrating their existing MongoDB workloads to the cloud, but also the builders who are creating modern applications and choosing DocumentDB as the right “purpose-built database” for their needs.
For anyone interested in learning more and getting hands-on with DocumentDB, we have a number of things coming up that may be of interest. We will be hosting two DocumentDB Focus Days, which are virtual workshops on best practices, in May and June. You can learn more and sign up on the registration page. Finally, we have an ongoing Twitch series where our solution architects (SAs) dive deeper on DocumentDB functionality, which you can learn more about on the website. Our DocumentDB product detail page is the best place to start for a general overview of the service and steps to get started, and you can refer to the documentation for an in-depth developer guide.
Barry Morris, GM ElastiCache, Timestream and DocumentDB. As General Manager of ElastiCache, Timestream and DocumentDB, Barry manages a number of businesses in the AWS database portfolio. He is focused on delivering value to AWS customers through trusted data management services, with a relentless commitment to database innovation.
Prior to joining AWS in 2020, his career includes over 20 years as the CEO of international technology companies, both private and public, including Undo.io, NuoDB, StreamBase, Headway, and IONA Technologies. Barry has also had leadership roles in PROTEK, Metrica, Lotus Development and DEC.
Born in South Africa, Barry lived in England and Ireland before moving to Boston. He holds a Bachelor’s Degree (BA) in engineering from Oxford University and an Honorary Doctorate in Business Administration (DBA) from the IMCA.
– Get Started with Amazon DocumentDB
– From SQL to NoSQL. Interview with Carlos Fernández. by Roberto V. Zicari.ODBMS Industry Watch, April 30, 2021
Follow us on Twitter: @odbmsorg