Rethinking what’s possible with innovations in Amazon RDS at AWS re:Invent. Q&A with Sirish Chandrasekaran

Q1. What is Amazon Relational Database Service? Why is innovation an important part of the service? 

In 2009, AWS pioneered fully-managed cloud database services for open-source database management systems (DBMS) with the launch of Amazon Relational Database Service, or Amazon RDS. Our goal was to accelerate the adoption of open-source databases, like MySQL and PostgreSQL, so developers can focus on creating new applications while removing the worries associated with supporting databases. Amazon RDS has grown over the past decade to support seven managed database engines, including Amazon Aurora. Amazon RDS offers benefits ranging from database operational features to security best practices. With Amazon RDS, we make it simple to set up, operate, and scale databases in the cloud. Five of its seven database engines are open source or open-source compatible: Amazon Aurora PostgreSQL-Compatible Edition, Amazon Aurora MySQL-Compatible Edition, Amazon RDS for PostgreSQL, Amazon RDS for MySQL, and Amazon RDS for MariaDB.

At AWS, innovation is at the core of everything we do. We are committed to ensuring that the things we build solve the complex database problems that customers have, and Amazon RDS is no exception to this strategy. As part of this, we provide our customers with a breadth of open-source services to build on. As a result, developers benefit from both the innovations driven by AWS and the advancements from familiar open-source communities. A great example of open-source innovation for Amazon RDS customers is when RDS for MariaDB announced support for MariaDB 10.6. This release included capabilities already available to the MariaDB community, such as support for the MyRocks storage engine, Oracle PL/SQL compatibility, and Atomic DDL support. At the same time, developers who love MariaDB can enjoy AWS-led innovations like Amazon RDS Blue/Green Deployments, which automates an advanced DevOps operational technique that I’ll discuss in more detail a bit later.

Q2. At AWS re:Invent this year, a number of capabilities were announced for Amazon RDS, specifically for its RDS for PostgreSQL, RDS for MySQL, and RDS for MariaDB services. Can you briefly introduce each of these?

Yes, at re:Invent, we announced four exciting new capabilities for our Amazon RDS users. First, we have Trusted Language Extensions for PostgreSQL, a development kit and open-source project that makes it easier to quickly build and safely run PostgreSQL extensions on Amazon Aurora, Amazon RDS, and any other PostgreSQL database. Second, Aurora MySQL-Compatible, RDS for MySQL, and RDS for MariaDB users can now perform safer, simpler, and faster database updates with zero data loss using Amazon RDS Blue/Green Deployments. Finally, RDS for MySQL users looking to enhance the performance of their databases can use Amazon RDS Optimized Writes for up to 2x improved write transaction throughput and Amazon RDS Optimized Reads for up to 2x faster query processing.

Q3. How does Trusted Language Extensions for PostgreSQL solve for issues that PostgreSQL users have today when using extensions? 

PostgreSQL has become one of the most popular relational database engines for developers, and one key reason is its extensive library of extensions. Today, AWS has certified 85+ extensions for use in Amazon Aurora and Amazon RDS. We hear from developers that they want to be able to use the larger library of PostgreSQL extensions in a managed database.

This prompted us to create Trusted Language Extensions (or TLE) for PostgreSQL, a new development kit and open-source project with which PostgreSQL developers can quickly deploy high-performance extensions to production in Amazon Aurora, Amazon RDS, and any other PostgreSQL database using trusted programming languages, including JavaScript, PL/pgSQL, Perl, and SQL. TLE is designed to protect user databases by limiting access to resources and confining the impact of any extension defect to a single database connection. Furthermore, database administrators have fine-grained control over who can install an extension and can create a permissions model for running them. As a result, developers can move forward with an extension as soon as they determine that it meets their needs. Independent Software Vendors looking to provide their extensions to Amazon Aurora and Amazon RDS can now do so using TLE.

Q4. How can members of the open-source PostgreSQL community engage in this project? How can I learn more about the project? 

TLE is an open-source project, so anyone can use and contribute to the project using the official Trusted Language Extensions GitHub repo. To build and test TLE in a local PostgreSQL database, developers can clone the repository and work from the source code. From new features, example extensions, and additional documentation to bug reports, all contributions are welcome.

For developers looking to get started with contributions, we recommend reviewing the existing GitHub issues, which carry labels such as enhancement, bug, duplicate, and help wanted, and starting with any ‘help wanted’ issues. Then, if you have something you’d like to contribute, we ask that you ensure you are working against the latest source on the main branch. Also, you should check existing, open, or recently merged pull requests to ensure the contribution hasn’t been addressed already, and open an issue to discuss any significant work. When you are ready to make a request, the steps to submit are to fork the repository, modify the source (focusing on the specific change you are contributing), ensure your local tests pass, and commit the change to your fork using clear commit messages. From there, you can send the TLE team a pull request, pay attention to any automated CI failures reported in your request, and stay involved in the conversation.

Q5. How do customers make updates, such as major or minor version upgrades or schema changes, to their Amazon RDS databases today? 

Prior to re:Invent 2022, Amazon Aurora and Amazon RDS customers could update their databases using one of two methods. First, they could do an “in-place database update.” Using this method, they would simply overwrite older databases with a new version that has their desired change implemented. This method of updating risks the safety of your production database and can lead to long, unpredictable downtime. The second option is for customers to use Amazon RDS database cloning and Amazon RDS Read Replicas to self-manage a staging environment and keep it up-to-date with the production environment. Customers would perform the desired changes on this environment, and then, manually promote the staging environment to production. This method of making changes is costly to build and manage, and requires considerable orchestration of resources and careful planning. 

Q6. How does Amazon RDS Blue/Green Deployments make database updates on Amazon RDS safer, simpler, and faster? 

AWS has elevated the process for database updates with Amazon RDS Blue/Green Deployments, a new enterprise-class feature that brings best-in-class database update processes to Aurora MySQL-Compatible, RDS for MySQL, and RDS for MariaDB. With Blue/Green Deployments, developers, DBAs, and DevOps professionals can make various database updates, including major or minor version upgrades, schema changes, instance scaling, engine parameter changes, and maintenance updates, with zero impact to their production workload. With just a few clicks, Blue/Green Deployments creates a staging environment, or “green environment”, that is a copy of the current production environment, or “blue environment”, including its primary instance, in-Region replicas, and enabled features. Blue/Green Deployments keeps the two environments in sync using logical replication.

With a single click, you can promote the staging environment to be the new production environment in as fast as a minute, with no application changes and zero data loss. While promoting your databases to production, Blue/Green Deployments protects production workloads using switchover guardrails. These guardrails block writes to the blue and green environments during switchover to ensure the green environment is up-to-date with the blue before promotion. They also check for replication errors, assess instance health, detect long-running transactions, and time out your promotion if it exceeds the maximum tolerable downtime that you set. We are excited to bring customers this capability because we have made an advanced DevOps technique easily available to Aurora MySQL-Compatible, RDS for MySQL, and RDS for MariaDB customers. As a result, these customers no longer have to choose between availability and new feature benefits.
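
As a rough illustration of the guardrail checks described above, the following Python sketch models a pre-switchover decision. It is not the Amazon RDS implementation or API: the `EnvironmentStatus` class, the `can_switch_over` function, the `max_txn_seconds` threshold, and the rule that compares replica lag against the timeout are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class EnvironmentStatus:
    """Hypothetical snapshot of the state inspected before a switchover."""
    replication_error: bool      # has blue-to-green replication reported an error?
    green_healthy: bool          # are the green instances healthy?
    replica_lag_seconds: float   # how far the green environment trails blue
    longest_txn_seconds: float   # age of the oldest open transaction on blue


def can_switch_over(status: EnvironmentStatus,
                    switchover_timeout_seconds: float,
                    max_txn_seconds: float = 30.0) -> Tuple[bool, str]:
    """Return (allowed, reason), mirroring the checks named in the text."""
    if status.replication_error:
        return False, "replication error"
    if not status.green_healthy:
        return False, "green environment unhealthy"
    if status.longest_txn_seconds > max_txn_seconds:
        return False, "long-running transaction"
    # Simplification: if green cannot catch up within the tolerated
    # downtime, treat the switchover as one that would time out.
    if status.replica_lag_seconds > switchover_timeout_seconds:
        return False, "would exceed switchover timeout"
    return True, "ok"


healthy = EnvironmentStatus(replication_error=False, green_healthy=True,
                            replica_lag_seconds=2.0, longest_txn_seconds=1.0)
print(can_switch_over(healthy, switchover_timeout_seconds=300.0))  # (True, 'ok')
```

The point of the sketch is the ordering: every check must pass before any promotion is attempted, so a failed guardrail leaves the blue environment serving production untouched.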

Q7. In 2022, Amazon RDS announced a number of capabilities that improved database performance. Can you talk a little bit about these? Why is database performance a priority for Amazon RDS? 

We understand that we live in a world where a business can lose a customer’s attention in seconds and customers expect results in the blink of an eye, and there’s no sign of this trend slowing down. As a result, over the past couple of years, we have strived to empower our customers with features that allow them to optimize and maximize the speed of their databases. With AWS Graviton2-based instances, customers can realize up to a 35% price/performance improvement on Amazon Aurora and up to a 52% price/performance improvement on Amazon RDS. Recently, we announced Amazon RDS Multi-AZ with two readable standbys, a feature that provides transaction commit latency that is up to 2x faster, automatic failovers in typically under 35 seconds, and increased read capacity. In May, RDS for PostgreSQL announced support for up to a maximum of 155 read replicas with cascaded read replicas. This capability, already available on RDS for MySQL and RDS for MariaDB, provides users with three levels of cascaded read replicas. Customers can create single-AZ or Multi-AZ cascaded read replica database instances in the same Region or in any one other Region from another read replica instance. Then, in October, Amazon RDS provided RDS for MySQL, RDS for MariaDB, and RDS for PostgreSQL developers with support for up to 15 read replicas per instance, including up to 5 cross-Region read replicas, further improving read capacity. However, we didn’t stop there and are glad to have brought two additional performance-enhancing features this re:Invent.
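
The 155-replica figure can be reproduced with a short calculation. The sketch below assumes a fan-out of 5 read replicas per instance, which is not stated in the answer above but makes the arithmetic for three cascade levels come out to 155; the function name is illustrative only.

```python
def max_cascaded_replicas(replicas_per_instance: int, levels: int) -> int:
    """Total replicas when every instance at each level feeds a full fan-out."""
    return sum(replicas_per_instance ** level for level in range(1, levels + 1))


# Assumed fan-out of 5 with three cascade levels: 5 + 25 + 125
print(max_cascaded_replicas(5, 3))  # 155
```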

Q8. What new performance features did you launch for Amazon RDS at re:Invent? 

We launched two exciting new features for our Amazon RDS for MySQL customers at re:Invent this year. We launched Amazon RDS Optimized Writes, which optimizes how the MySQL engine writes data today to deliver up to a 2x improvement in write transaction throughput. We also launched Amazon RDS Optimized Reads, which improves how RDS for MySQL and RDS for MariaDB handle complex queries by processing them up to 2x faster.

Q9. How does MySQL write databases today? How does Amazon RDS Optimized Writes improve this process? 

When running a relational database, developers and database administrators expect a certain level of durability to protect their data. To provide it, MySQL today handles database writes by writing 16 kibibyte (KiB) in-memory data pages to storage in 4 KiB chunks. However, if there is a system failure, some of these chunks may not get written to storage, leaving the page corrupted. To protect against this, MySQL uses what is called a “doublewrite buffer”, so data is first written to the buffer and then to storage. This means that even if a failure corrupts the data in storage, there is still an intact copy of the written data in the buffer. While this protects MySQL users from data loss, it usually takes twice as long as writing once, requires twice the IOPS, and negatively affects database throughput and performance. If a workload has many concurrent transactions, developers may need to provision additional IOPS to meet performance requirements.

Amazon RDS Optimized Writes is a new feature of RDS for MySQL that improves write transaction throughput by up to 2x at no additional cost. Optimized Writes protects customer data by writing 16 KiB pages to storage in a single atomic operation. To write to the database in a single step, Optimized Writes uses the recently announced Torn Write Prevention (TWP), a new feature of the AWS Nitro System, to ensure that writes reach table storage safely and are protected from failure while writing. Optimized Writes is available to customers who create new RDS for MySQL databases on supported instances running version 8.0.30 or higher.
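
To make the torn-page problem and the cost of the doublewrite buffer concrete, here is a small in-memory Python simulation. It is a teaching sketch, not MySQL or Amazon RDS code: the `write_page` helper and the byte-comparison used to detect a torn page are assumptions for illustration (a real engine would rely on page checksums).

```python
PAGE_SIZE = 16 * 1024   # one in-memory data page, 16 KiB
CHUNK_SIZE = 4 * 1024   # size of each write to storage, 4 KiB


def write_page(storage: bytearray, page: bytes, fail_after_chunks=None) -> int:
    """Copy a page into storage chunk by chunk; stop early to simulate a crash.

    Returns the number of bytes actually written.
    """
    written = 0
    for offset in range(0, PAGE_SIZE, CHUNK_SIZE):
        if fail_after_chunks is not None and offset // CHUNK_SIZE >= fail_after_chunks:
            break  # simulated system failure mid-page
        storage[offset:offset + CHUNK_SIZE] = page[offset:offset + CHUNK_SIZE]
        written += CHUNK_SIZE
    return written


old_page = bytes([1]) * PAGE_SIZE
new_page = bytes([2]) * PAGE_SIZE

doublewrite_buffer = bytearray(PAGE_SIZE)
table_storage = bytearray(old_page)

# Doublewrite: the page goes to the buffer first, then to table storage.
bytes_written = write_page(doublewrite_buffer, new_page)
bytes_written += write_page(table_storage, new_page, fail_after_chunks=2)

# A torn page is neither the old version nor the new one.
torn = bytes(table_storage) not in (old_page, new_page)

# Recovery: restore the intact copy from the doublewrite buffer.
if torn:
    table_storage[:] = doublewrite_buffer
recovered = bytes(table_storage) == new_page

# Write volume per page: two full copies with doublewrite,
# one copy when the 16 KiB write is atomic.
doublewrite_bytes = 2 * PAGE_SIZE
atomic_bytes = PAGE_SIZE
print(torn, recovered, doublewrite_bytes // atomic_bytes)  # True True 2
```

The final ratio shows where the "up to 2x" headroom comes from: when storage guarantees that a 16 KiB write either fully lands or does not land at all, the second copy is no longer needed.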

Q10. Similarly, how do Amazon RDS for MySQL and Amazon RDS for MariaDB process queries today? How does Amazon RDS Optimized Reads accelerate query processing by up to 2x?

When processing queries, RDS for MySQL and RDS for MariaDB read from Amazon Elastic Block Store (EBS). While this process is efficient for most workloads, complex queries, such as those that use complex grouping or sorting, require RDS for MySQL and RDS for MariaDB to generate temporary objects. When these objects don’t fit into memory, they are moved to disk storage. In the case of Amazon RDS, this means that temporary objects are written to and read from EBS.

To speed up complex query processing on RDS for MySQL and RDS for MariaDB, customers can now use Amazon RDS Optimized Reads for up to 2x faster query processing. Optimized Reads supports complex queries that use temporary tables, such as queries involving sorts, hash aggregations, high-load joins, and Common Table Expressions (CTEs). Rather than placing temporary tables on EBS, Optimized Reads places complex queries’ temporary tables directly on the database instance’s local NVMe storage, allowing you to process queries up to 2x faster than before. We are excited to have implemented Optimized Reads for RDS for MySQL customers on versions 8.0.28 and higher and for RDS for MariaDB customers on versions 10.4.25, 10.5.16, 10.6.7 and higher.

Q11. It appears zero ETL will be a big focus so can you tell us more about the Amazon Aurora zero-ETL with Amazon Redshift?

For developers looking to derive near real-time benefits from their data, one of the biggest challenges is the need to perform extract, transform, load (ETL) operations to move transactional data from an operational database into an analytics data warehouse. It is a requirement that often leaves engineers having to construct and maintain complex data pipelines. Amazon Aurora’s zero-ETL integration with Amazon Redshift, our data warehouse, removes that requirement. With zero-ETL, customers can enable near real-time analytics and machine learning (ML) using Amazon Redshift on petabytes of transactional data from Aurora. Within seconds of transactional data being written into Aurora, the data is available in Amazon Redshift, so customers don’t have to build and maintain complex data pipelines to perform ETL operations.

Q12. What else launched for Amazon Aurora?

Customers looking to further protect their data can now do so with Amazon GuardDuty RDS Protection, available in preview. GuardDuty RDS Protection uses ML anomaly detection to detect suspicious login attempts to Aurora databases that are indicative of the early stages of data exfiltration and ransomware attacks. Customers can enable GuardDuty RDS Protection with just a few clicks without impacting operational database performance or requiring modifications. This extension of Amazon GuardDuty delivers key benefits for protecting data and database workloads, including proactively identifying potential threats to the data stored in customers’ Aurora databases and monitoring all login activity to existing and new Aurora databases in their account.

Q13. Anything else you wish to add? 

We are very excited to have brought all of these innovations for the customers of Aurora and Amazon RDS this year. To learn more about these and future innovations, explore our Aurora and Amazon RDS product pages as well as our documentation pages. If you have questions, go to Contact Us and you’ll hear back from AWS in one business day to discuss how AWS can help your organization.

………………………………….

Sirish Chandrasekaran

GM Amazon RDS Open Source Databases (PostgreSQL, MySQL, MariaDB)

Sirish Chandrasekaran is GM and Engineering Director for Amazon RDS Open Source Databases, overseeing the RDS for PostgreSQL, RDS for MySQL, and RDS for MariaDB services. Prior to that, Sirish was an engineering and product leader for Amazon Aurora with MySQL compatibility. Prior to joining Amazon in 2016, Sirish was an executive at Dropbox, where he held various roles focused on their B2B offering (Dropbox for Business), helping the company scale through a sustained period of hypergrowth. Before Dropbox, Sirish was an Associate Partner at McKinsey & Company, serving companies globally across a range of strategy, operations, and technology issues. Sirish received his PhD and MS in Computer Science from U.C. Berkeley, and B. Tech in Computer Science and Engineering from IIT, Madras.

Sponsored by AWS
