On Amazon DocumentDB Global Clusters. Q&A With Meet Bhagdev

Q1. What is Amazon DocumentDB (with MongoDB compatibility)? 

Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that is purpose-built to store and query JSON. Amazon DocumentDB is engineered with scalable workloads in mind, and allows compute and storage to scale independently. You can easily scale read capacity to millions of requests per second by adding up to 15 low-latency read replicas across three Availability Zones (AZs) in minutes, regardless of data size. Storage scales automatically up to 64 TiB.

Q2. Amazon DocumentDB is MongoDB compatible – What does it mean in practice? 

“MongoDB compatible” means that Amazon DocumentDB interacts with the Apache 2.0 open source MongoDB APIs. As a result, you can use the same MongoDB drivers, applications, and tools with Amazon DocumentDB with little or no change. Amazon DocumentDB supports the vast majority of the MongoDB APIs that customers actually use, but it does not support every MongoDB API. Our focus has been to deliver the capabilities that customers need. Since launch, we have continued to work backwards from customers and have delivered an additional 80+ capabilities, including MongoDB 4.0 compatibility and transactions.
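
As a rough illustration of what driver-level compatibility means in practice, the sketch below assembles a standard MongoDB connection URI of the kind a stock MongoDB driver accepts. The endpoint, user name, and password are placeholders, and the option values are assumptions chosen for illustration, not settings taken from this interview.

```python
from urllib.parse import quote_plus, urlencode


def build_mongodb_uri(host, user, password, port=27017, **options):
    """Assemble a standard mongodb:// URI; credentials are percent-encoded."""
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    query = urlencode(options)
    return f"mongodb://{credentials}@{host}:{port}/?{query}"


# Hypothetical cluster endpoint and credentials, for illustration only.
uri = build_mongodb_uri(
    "example-cluster.cluster-abc123.us-east-1.docdb.amazonaws.com",
    "appuser",
    "p@ss/word",
    tls="true",
    replicaSet="rs0",
    readPreference="secondaryPreferred",
)
print(uri)
# An unmodified MongoDB driver would take this string directly, for example
# pymongo.MongoClient(uri) if pymongo is installed (it is not required here).
```

The point of the sketch is that nothing in the URI is specific to DocumentDB beyond the host name, so existing application code that already builds such a string should need little or no change.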

Q3. What are the main reasons to use Amazon DocumentDB? 

There are four primary reasons to use DocumentDB. First, at its core, DocumentDB is designed to store, index, and query rich and complex JSON documents with high availability and scalability. You can retrieve documents based on nested field values, join data across collections, and perform aggregation queries. So, if you need schema flexibility and the ability to index and query rich structured and semi-structured documents, DocumentDB is a great choice.

Second, DocumentDB gives customers more flexibility than other document database vendors across instances, storage, and I/O, which is advantageous in many scenarios, such as dev/test workloads.

Third, due to DocumentDB’s unique cloud native architecture, compute and storage are decoupled, allowing each to scale independently. Because storage and compute are separate, customers can add replicas without putting additional load on the primary. This allows you to easily scale out read capacity to millions of requests per second by adding up to 15 low-latency read replicas across three AWS Availability Zones (AZs) in minutes. DocumentDB’s distributed, fault-tolerant, self-healing storage system auto-scales storage up to 64 TiB per database cluster without the need for sharding, and without any impact or downtime to a customer’s application, making DocumentDB ideal for mission-critical applications. You also benefit from integration with AWS services such as VPC, CloudWatch metrics, and encryption at rest with custom KMS keys.

Finally, since DocumentDB supports MongoDB workloads and is compatible with the MongoDB API, it is a logical choice for MongoDB users who are looking to easily migrate to a fully managed database solution.

Q4. Who is using Amazon DocumentDB? And what do they use it for? 

We have customers in virtually every industry, from financial services to retail, from gaming to manufacturing, from media and entertainment to publishing, and more. Amazon DocumentDB is being used today by a wide variety of customers, from enterprises like Dow Jones and BBC to digital natives like Zulily. Zulily, for example, is an online retailer that sells merchandise to mothers and their children. They draw customers in by making shopping an event-based experience, where each day brings a new assortment of women’s fashion, children’s items, and home décor to browse. Zulily used DocumentDB in combination with Amazon Kinesis to develop a recommendation engine that gathers search data and checks inventory levels to present products that align with popular searches. You can see a detailed breakdown of how Zulily created this search and recommendation engine in this episode of This is My Architecture. Readers can find detailed case studies on the Asahi Shimbun Company, Punchh, Woot, Rappi, and many more. The full list of DocumentDB customers can be found on our customer page.

Q5. You recently launched global clusters. What is it, and what is it useful for?

Amazon DocumentDB Global Clusters is a new feature that provides disaster recovery from Region-wide outages and enables low-latency global reads by allowing reads from the nearest Amazon DocumentDB cluster. A global cluster consists of a primary cluster that allows read and write operations, and up to five secondary clusters in other Regions, which allow read operations. Global Clusters is useful for workloads with a global footprint that have strict availability requirements and may need to tolerate a Region-wide outage with a very low Recovery Time Objective (RTO). It is also useful for globally distributed workloads that need to serve read traffic to users around the world from a Region closer to them. To summarize, Global Clusters helps you support critical, global workloads by automatically replicating data across multiple AWS Regions, with sub-second latencies.

Q6. What are the main benefits of deploying a cluster that spans across multiple AWS Regions? 

Global Clusters in Amazon DocumentDB have the following benefits:

  1. Disaster recovery from Region-wide outages – While Region-wide outages are uncommon, Global Clusters allow you to recover from them in less than 60 seconds. In the event of a Region-wide outage, you can use Global Clusters to promote one of your secondary clusters to a standalone primary cluster that can handle both read and write traffic.
  2. Global reads with low latency – If your application is globally distributed, you can use Global Clusters to replicate data to other Regions so that your users can read data from the secondary cluster in the Region that is closest to them.
  3. Low-cost secondary clusters – The number and type of instances in the primary and secondary clusters don’t need to be the same. For example, you can create secondary clusters with one replica instance and scale up to 16 instances as needed. This enables you to set up multi-Region deployments at a fraction of the cost of other document database solutions.
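
To make the "global reads with low latency" benefit concrete, here is a minimal sketch of how a client might pick the closest cluster for reads. It assumes the application has already measured round-trip times to each Region; the endpoint names, Regions, latency figures, and the helper `nearest_read_endpoint` are all hypothetical and not part of any AWS API.

```python
def nearest_read_endpoint(latency_ms, endpoints):
    """Return (region, endpoint) for the Region with the lowest measured latency."""
    # Ignore measurements for Regions that have no cluster in the global cluster.
    candidates = {r: ms for r, ms in latency_ms.items() if r in endpoints}
    if not candidates:
        raise ValueError("no latency measurement matches a known cluster Region")
    region = min(candidates, key=candidates.get)
    return region, endpoints[region]


# Hypothetical reader endpoints, one per cluster in the global cluster.
endpoints = {
    "us-east-2": "primary.cluster-ro-example.us-east-2.docdb.amazonaws.com",
    "eu-west-1": "secondary.cluster-ro-example.eu-west-1.docdb.amazonaws.com",
    "ap-south-1": "secondary.cluster-ro-example.ap-south-1.docdb.amazonaws.com",
}

# Hypothetical round-trip times (milliseconds) measured by a client in Europe.
measured = {"us-east-2": 95.0, "eu-west-1": 12.5, "ap-south-1": 140.0}

region, endpoint = nearest_read_endpoint(measured, endpoints)
print(region, endpoint)
```

Writes would still go to the primary cluster; only read traffic is routed this way.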

Q7. How does Global Clusters work? 

Global clusters use fast, storage-based physical replication of data from the primary Region to secondary clusters in other Regions. The compute instances provisioned in primary and secondary Regions don’t participate in replication, which frees them up for serving application requests. Through this storage-based replication, DocumentDB can replicate data changes across Regions, typically in less than one second. You can watch this video to find out how to get started with Global Clusters.

Q8. Global clusters should help for disaster recovery from region-wide outages. How? 

A global cluster consists of a primary cluster that allows read and write operations, and up to five secondary clusters in other Regions, which allow read operations. For example, let’s say you have a primary cluster in US East (Ohio) and two secondary clusters, one in US East (N. Virginia) and one in US West (Oregon). A rare Region-wide failure in US East (Ohio) causes your primary cluster to become unavailable. With Global Clusters, you can promote one of your secondary clusters, in US East (N. Virginia) or US West (Oregon), to a standalone primary cluster with full read/write capabilities in less than one minute. You can then redirect your application to write to this newly promoted cluster endpoint and recover from the primary Region’s unavailability.
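
The failover sequence in this example can be sketched as a small in-memory model. This is a toy illustration of the steps described above, not the AWS API: the `GlobalCluster` class and `promote_secondary` function are hypothetical names, and the Region codes are the ones commonly used for Ohio, N. Virginia, and Oregon.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalCluster:
    """Toy model: one writable primary Region plus read-only secondaries."""
    primary: str
    secondaries: list = field(default_factory=list)


def promote_secondary(cluster, failed_region, target_region):
    """Return the Region that should take writes after the primary Region fails.

    The target is detached from the global cluster and treated as a
    standalone primary; this model does not set up any new replication.
    """
    if cluster.primary != failed_region:
        raise ValueError("only a failure of the primary Region needs a promotion")
    if target_region not in cluster.secondaries:
        raise ValueError(f"{target_region} is not a secondary of this global cluster")
    cluster.secondaries.remove(target_region)
    return target_region


# Primary in Ohio, secondaries in N. Virginia and Oregon, as in the example.
cluster = GlobalCluster("us-east-2", ["us-east-1", "us-west-2"])
write_region = promote_secondary(cluster, "us-east-2", "us-east-1")
print("redirect writes to", write_region)
```

In a real deployment the last step corresponds to pointing the application's write connection at the newly promoted cluster's endpoint.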

Q9. What are the prerequisites to run global clusters? 

To create an Amazon DocumentDB global cluster, you need an Amazon DocumentDB cluster to serve as a primary cluster. You can use an existing Amazon DocumentDB cluster or create a new one. Global clusters are available starting with DocumentDB v4.0.
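
The constraints mentioned in this interview (engine version 4.0 or later, up to five secondary clusters, secondaries in Regions other than the primary) can be checked before provisioning. The sketch below is a plain-Python pre-flight check under those stated assumptions; `check_global_cluster_plan` is a hypothetical helper, and the service may enforce additional rules not covered here.

```python
MAX_SECONDARY_CLUSTERS = 5   # "up to five secondary clusters"
MIN_ENGINE_VERSION = (4, 0)  # "available starting with DocumentDB v4.0"


def check_global_cluster_plan(engine_version, primary_region, secondary_regions):
    """Return a list of problems with a planned global cluster; empty means OK."""
    problems = []
    # Compare only major.minor; pad so that a bare "4" is read as 4.0.
    parts = (engine_version.split(".") + ["0"])[:2]
    if tuple(int(p) for p in parts) < MIN_ENGINE_VERSION:
        problems.append(f"engine version {engine_version} is older than 4.0")
    if len(secondary_regions) > MAX_SECONDARY_CLUSTERS:
        problems.append("more than five secondary clusters requested")
    if primary_region in secondary_regions:
        problems.append("secondary clusters must be in Regions other than the primary")
    return problems


ok_plan = check_global_cluster_plan("4.0.0", "us-east-2", ["us-east-1", "us-west-2"])
bad_plan = check_global_cluster_plan("3.6.0", "us-east-2", ["us-east-2"])
print(ok_plan)
print(bad_plan)
```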

Q10. Anything else you wish to add?

We have a free webinar on Global Clusters that helps potential customers learn how to enable a cluster that spans up to five AWS Regions with little to no impact on performance. In addition, viewers will learn how to leverage Global Clusters to enable low-latency global reads and provide disaster recovery for Region-wide outages with a very low Recovery Time Objective (RTO). Potential customers can also visit our DocumentDB product page and developer guide for additional details.

……………………………………

Meet Bhagdev is a Senior Product Manager at Amazon DocumentDB. Meet works on driving product, strategy and roadmap for DocumentDB. Meet is deeply passionate about databases, open source and developer experiences. You can reach him at @meet_bhagdev (Twitter).
Prior to joining AWS in 2020, his career includes five years as a Product Manager in Azure, where he worked on and was responsible for launching several database and analytics services. Born in India, Meet moved to the United States in 2011 for college. He holds a Bachelor’s Degree in Computer Science from the University of California, Los Angeles (UCLA).

Sponsored by Amazon Web Services
