Advantages of Amazon Aurora over traditional MySQL & PostgreSQL. Q&A with Colin Mahony and Pravin Mittal
Q1. What is Amazon Aurora?
Colin: Amazon Aurora is a cloud-native database that is fully compatible with open-source MySQL and PostgreSQL. Aurora supports the entire surface area of open-source MySQL and PostgreSQL functionality and offers drop-in compatibility for applications running on these databases. Aurora was built for customers who need a fully managed database service that has the cost and simplicity of open-source databases and the performance of a commercial database.
What Aurora uniquely offers is a track record of operational excellence and enterprise features, at one-tenth the cost of commercial databases. Aurora offers unparalleled performance, availability, and security at global scale. Since its launch in 2014, Aurora has been the fastest growing service in the AWS portfolio.
Q2. What are three main benefits of Amazon Aurora?
Pravin: The first main benefit of Aurora is that it is MySQL- and PostgreSQL-compatible and designed to power performance-intensive enterprise workloads. Open-source databases are the top choices for most organizations, big or small. The allure of open-source databases is, in large part, based on the desire to break free from punitive licensing agreements and daunting audits related to commercial databases. With Aurora, organizations can continue to use highly popular open-source databases and associated tools with the benefit of a managed service for enterprise workloads. You can migrate to Aurora using standard open-source tools and snapshots, and with the drop-in compatibility, applications built on open-source databases require little to no change to work with Aurora. You may also create an Aurora read replica from an existing RDS for MySQL or RDS for PostgreSQL database with just a click in our management console.
The second main benefit of Aurora is that it is designed for unparalleled high performance and availability, with techniques and features which are simply inaccessible via open source databases. Aurora uses a variety of software and hardware techniques to ensure the database engine is able to fully use available compute, memory, and networking. These techniques result in cost efficiencies that we pass on to customers. For example, Aurora’s architecture separates compute from storage, allowing each to scale independently. Aurora storage automatically scales up to 128 TiB and maintains 6 copies across 3 Availability Zones (AZ). Although you only pay for one copy of data, you gain from the durability achieved by maintaining 6 copies across AZs and parallelism through the processing power of hundreds to thousands of storage nodes. On compute, Aurora offers provisioned or Aurora Serverless to automatically scale based on application needs. Aurora also supports up to 15 read replicas, which help in scaling out read-only workloads while also providing high availability for your workloads when placed across AZs. In addition, with Amazon Aurora Global Database, you can expand your database in up to 5 regions for multi-region resiliency against disasters and also to serve low-latency reads closer to your end customers, with up to 15 read replicas supported in each region.
The third main benefit of Aurora is that it is fully managed and maintained by AWS, allowing your DBAs and development teams to focus exclusively on application development and schema management. Your teams are freed from time-consuming database tasks such as server provisioning, patching, backups, and configuring high availability and disaster recovery. Aurora provides continuous monitoring and self-healing storage that automatically grows and shrinks. With Amazon Aurora Serverless, the database compute capacity automatically scales up or down based on your application’s needs. You only pay for what you use, so this is a very cost-effective approach for utilizing resources. Many of these core components of Aurora are fully automated through our streamlined control plane and advanced diagnostic mechanisms. In contrast, open-source databases require users to dedicate valuable engineering and database administrator hours to undifferentiated tasks.
Q3. Can you expand on Aurora’s drop-in compatibility with open-source MySQL and PostgreSQL?
Colin: We have invested extensively in making Aurora fully compatible with open-source MySQL and PostgreSQL databases for a consistent experience. Application code, drivers, and tools you already use today with your MySQL or PostgreSQL databases can be used with Aurora with little to no changes. Aurora also comes with differentiating features like Serverless, Global Database, and Parallel Query. Aurora has built-in integrations with many AWS services. Integration with Amazon DevOps Guru for RDS helps you easily improve performance of your database. Aurora ML helps you execute machine learning models hosted in Amazon SageMaker and Amazon Comprehend using intuitive SQL commands. Also, fast cloning of your database with Aurora is a very useful feature for development, testing, experiments, or analytics.
Q4. How does Aurora achieve high performance and scalability?
Pravin: With Aurora, we re-imagined how relational databases can take advantage of a cloud infrastructure, and Aurora’s storage architecture is part of this. Since all the data is always hardened in Aurora storage that maintains six copies across 3 AZs, compute nodes can be easily started, scaled, and stopped based on workload requirements without any data loss or disruption. You can add up to 15 low-latency (<30 millisecond) read replicas that connect to the same storage layer to scale your read-only workloads. With Aurora Serverless v2, we automatically adjust your database compute capacity in a fraction of a second to closely match the needs of your application to save costs. Aurora offers fast recovery from failures. If a writer fails, a read replica is automatically promoted to take over without waiting for the other nodes to reach consensus. All the shared state is in the data nodes, so failed nodes can be replaced almost immediately. Aurora’s unique storage model also facilitates continuous backups and restores with a very low recovery point objective (RPO). In addition, with Aurora Global Database you can expand your database in up to 5 regions to serve low-latency reads closer to your end customers, with up to 15 read replicas supported in each region.
For the storage layer, Aurora uses a distributed and shared storage architecture that is an important factor in performance, scalability, and reliability of Aurora clusters. You don’t need to predict and provision the storage needed for uninterrupted business operations: Aurora’s cluster storage volumes automatically grow (up to 128 TiB per cluster) or shrink to match your data capacity requirements. Read replicas can be added quickly for additional read-scaling, because all replicas use the same shared storage, and a new replica can start handling queries very quickly without having to replicate data from the other nodes. It’s worth noting that data transferred between AZs for database cluster replication is free. Another key capability of our storage architecture is Aurora Global Database, with which a single database can span multiple regions for faster local reads and disaster recovery.
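To make the six-copy layout concrete, here is a minimal Python sketch of how many failures such a layout can absorb. The quorum sizes used (4 of 6 copies to write, 3 of 6 to read) are not stated in this interview; they come from AWS's published description of Aurora's storage design. The function names are illustrative only.

```python
# Illustrative model of Aurora storage: 6 copies, 2 per AZ across 3 AZs.
# Quorum sizes come from AWS's published Aurora design, not from this interview.
COPIES_PER_AZ = 2
AZ_COUNT = 3
WRITE_QUORUM = 4  # copies that must acknowledge a write
READ_QUORUM = 3   # copies needed to read (relevant during repair)


def surviving_copies(failed_azs: int, extra_failed_copies: int = 0) -> int:
    """Copies still reachable after losing whole AZs plus individual copies."""
    total = COPIES_PER_AZ * AZ_COUNT
    lost = failed_azs * COPIES_PER_AZ + extra_failed_copies
    return max(total - lost, 0)


def can_write(failed_azs: int, extra_failed_copies: int = 0) -> bool:
    """True if enough copies remain to form a write quorum."""
    return surviving_copies(failed_azs, extra_failed_copies) >= WRITE_QUORUM


def can_read(failed_azs: int, extra_failed_copies: int = 0) -> bool:
    """True if enough copies remain to form a read quorum."""
    return surviving_copies(failed_azs, extra_failed_copies) >= READ_QUORUM


if __name__ == "__main__":
    print(can_write(failed_azs=1))                         # True
    print(can_write(failed_azs=1, extra_failed_copies=1))  # False
    print(can_read(failed_azs=1, extra_failed_copies=1))   # True
```

Under these assumptions, losing an entire AZ still leaves a write quorum, and losing an AZ plus one more copy still leaves enough copies to read and rebuild.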
Q5. You mentioned Aurora has differentiating features like Serverless. Please tell us about Aurora Serverless v2, which launched three months ago.
Colin: With Amazon Aurora Serverless v2, you can deploy Aurora on-demand with auto-scaling where the database automatically scales capacity up or down based on your application’s needs. With Aurora Serverless v2, customers can benefit from faster and granular scaling. This is particularly useful if you have spiky, intermittent, or unpredictable workloads. You pay on a per-second basis for the database capacity that you use when the database is active, and migrate between standard and serverless configurations with a few steps.
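The per-second billing model can be illustrated with a small arithmetic sketch. It assumes capacity is metered in Aurora Capacity Units (ACUs), a term not used in this interview, and it uses a purely hypothetical price per ACU-hour; check current AWS pricing for real figures.

```python
def serverless_cost(usage, price_per_acu_hour):
    """Cost of a usage profile billed per second.

    usage: iterable of (acus, seconds) intervals while the database is active.
    price_per_acu_hour: hypothetical price, not an actual AWS rate.
    """
    acu_seconds = sum(acus * seconds for acus, seconds in usage)
    return acu_seconds / 3600 * price_per_acu_hour


if __name__ == "__main__":
    HYPOTHETICAL_PRICE = 0.12  # assumed price per ACU-hour, for illustration only

    # One hour of a spiky workload: 50 minutes near idle, 10 minutes at peak.
    spiky_hour = [(0.5, 3000), (8, 600)]
    # The same hour with capacity held at the peak level throughout.
    peak_sized_hour = [(8, 3600)]

    print(f"{serverless_cost(spiky_hour, HYPOTHETICAL_PRICE):.2f}")       # 0.21
    print(f"{serverless_cost(peak_sized_hour, HYPOTHETICAL_PRICE):.2f}")  # 0.96
```

The gap between the two figures is the saving that scaling down between spikes would yield under these assumed numbers.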
You can run your database in the cloud without managing any database instances. Manually managing database capacity can take up valuable time and can lead to inefficient use of database resources. With Aurora Serverless v2, you create a database, specify the desired database capacity range, and connect your applications.
For each of your Aurora database clusters, you can choose any combination of Aurora Serverless v2 capacity, provisioned capacity, or a mixed configuration consisting of both. For example, suppose that you need more read/write capacity than is available for an Aurora Serverless v2 writer. In that case, you can set up the cluster with a very large provisioned writer and still use Aurora Serverless v2 for the readers. Or suppose the write workload for your cluster varies but the read workload is steady. In this case, you can set up your cluster with an Aurora Serverless v2 writer and one or more provisioned readers.
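A mixed configuration like this can be sketched as plain data. The Python example below only builds dictionaries and makes no AWS calls. The key names mirror the RDS API parameters for creating clusters and instances (`ServerlessV2ScalingConfiguration`, `DBInstanceClass`, with `db.serverless` marking a Serverless v2 instance) as best understood here, so verify them against the current API reference. The cluster name, instance classes, and capacity range are hypothetical.

```python
def mixed_cluster_layout(cluster_id, min_acu, max_acu, writer_class, reader_classes):
    """Return (cluster_params, instance_params_list) as plain dictionaries."""
    if not 0 <= min_acu <= max_acu:
        raise ValueError("capacity range must satisfy 0 <= min_acu <= max_acu")

    cluster = {
        "DBClusterIdentifier": cluster_id,
        # Capacity range shared by every Serverless v2 instance in the cluster.
        "ServerlessV2ScalingConfiguration": {
            "MinCapacity": min_acu,
            "MaxCapacity": max_acu,
        },
    }

    # The names are labels only; the instance's role in the cluster is not
    # set by its identifier.
    instances = [
        {
            "DBInstanceIdentifier": f"{cluster_id}-writer",
            "DBClusterIdentifier": cluster_id,
            "DBInstanceClass": writer_class,
        }
    ]
    for index, reader_class in enumerate(reader_classes, start=1):
        instances.append(
            {
                "DBInstanceIdentifier": f"{cluster_id}-reader-{index}",
                "DBClusterIdentifier": cluster_id,
                "DBInstanceClass": reader_class,
            }
        )
    return cluster, instances


if __name__ == "__main__":
    # Large provisioned writer, Serverless v2 readers (hypothetical values).
    cluster, instances = mixed_cluster_layout(
        "orders", 0.5, 16, "db.r6g.8xlarge", ["db.serverless", "db.serverless"]
    )
    print(cluster)
    for instance in instances:
        print(instance)
```

Swapping the arguments (a `db.serverless` writer with provisioned reader classes) gives the second scenario, where writes vary and reads are steady.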
Q6. Another differentiating feature is Aurora Global Database. Please tell us how Aurora Global Database simplifies the management of cross-regional MySQL and PostgreSQL databases?
Pravin: Amazon Aurora Global Database is an important feature that automates the management of global databases and empowers organizations to build globally distributed applications. With Global Database, a single Aurora database can span multiple AWS Regions. It replicates your data with no impact on database performance, enables fast local reads with low latency in each Region, and provides disaster recovery from region-wide outages. You can also leverage any mix of provisioned or Aurora Serverless v2 for your Global Databases in other regions.
Q7. Please share the differentiating benefits of Aurora Parallel Query available on Aurora MySQL engine.
Colin: Amazon Aurora Parallel Query provides faster analytical queries over your live data, without having to copy the data into a separate system. It can speed up queries by up to two orders of magnitude, while maintaining high throughput for your core transactional workload.
While some databases can parallelize query processing across CPUs in one or a handful of servers, Parallel Query takes advantage of Aurora’s unique architecture to push down and parallelize query processing across thousands of CPUs in the Aurora storage layer. By offloading analytical query processing to the Aurora storage layer, Parallel Query reduces network, CPU, and memory contention with the transactional workload.
Aurora Parallel Query is the right fit if your query involves current data. If you’re querying large volumes of historical data or if you want to integrate data from multiple sources, Amazon Redshift, our petabyte-scale data warehousing service, is a better fit.
Q8. How does Aurora help with machine learning workloads?
Pravin: Amazon Aurora machine learning (Aurora ML) enables you to add ML-based predictions to applications via SQL. When you run an ML query, Aurora calls SageMaker for a wide variety of ML algorithms or Comprehend for sentiment analysis, so your application does not have to call these services directly. This makes Aurora machine learning suitable for low-latency, real-time use cases such as fraud detection, ad targeting, and product recommendations. For example, you can build product recommendation systems by writing SQL queries in Aurora that pass customer profile, shopping history, and product catalog data to a SageMaker model, and get product recommendations returned as query results. Using this integration, customers can build smart applications that provide analytics in the context of the user’s workflow.
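As a rough sketch of what "predictions via SQL" can look like from application code, the Python example below runs a sentiment query through any DB-API 2.0 connection. It assumes an Aurora PostgreSQL cluster with the Aurora machine learning extension installed and its `aws_comprehend.detect_sentiment` function available, and it assumes a driver that uses `%s` placeholders. The `reviews` table and its columns are hypothetical.

```python
# Hypothetical schema: reviews(review_id, product_id, review_text).
# aws_comprehend.detect_sentiment is assumed to be provided by the Aurora
# machine learning extension on Aurora PostgreSQL; verify the signature in
# the Aurora documentation before relying on it.
SENTIMENT_SQL = """
SELECT r.review_id, s.sentiment, s.confidence
FROM reviews AS r,
     aws_comprehend.detect_sentiment(r.review_text, 'en') AS s
WHERE r.product_id = %s
"""


def fetch_review_sentiment(conn, product_id):
    """Return (review_id, sentiment, confidence) rows for one product.

    conn is any DB-API 2.0 connection whose driver uses %s placeholders.
    The application never calls the ML service itself; the database does.
    """
    cur = conn.cursor()
    try:
        cur.execute(SENTIMENT_SQL, (product_id,))
        return cur.fetchall()
    finally:
        cur.close()
```

The application issues an ordinary parameterized query and receives the predictions as result rows, which is the integration pattern described above.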
Q9. How does Amazon Aurora integrate with other AWS services?
Pravin: In addition to what I shared on Aurora ML integrations with SageMaker and Comprehend, Aurora also integrates seamlessly with many other AWS services. As an example, you can load data into Aurora MySQL or PostgreSQL from Amazon S3 with a few clicks. Similarly, you can export from Aurora MySQL or PostgreSQL into S3. Both Aurora MySQL and PostgreSQL integrate with AWS Lambda for event-based architectures and Lambda function invocation. Customers can also use AWS Backup as a centralized solution to plan and orchestrate their backups for a number of AWS services, including Amazon Aurora databases. With CloudWatch Logs, you can monitor, store, and query your log files. Another way to monitor your databases is DevOps Guru for RDS, which uses machine learning to detect performance bottlenecks and operational issues, notifies you immediately, and provides diagnostic information and intelligent recommendations. There are many other examples.
Connecting your applications to Aurora is equally simple. You can connect to an Aurora database cluster using the same tools that you use to connect to a MySQL or PostgreSQL database. You specify a connection string with any script, utility, or application that connects to a MySQL or PostgreSQL database instance. You use the same public key for Secure Sockets Layer (SSL) connections.
In the connection string, you typically use the host and port information from special endpoints associated with the database cluster. With these endpoints, you can use the same connection parameters regardless of how many database instances are in the cluster. You also use the host and port information from a specific database instance in your Aurora database cluster for specialized tasks, such as troubleshooting.
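A minimal Python sketch of this idea builds standard PostgreSQL connection URIs for different endpoints. The hostnames are hypothetical examples of a cluster (writer) endpoint, a reader endpoint, and a single-instance endpoint; the database name, user, and password are placeholders. A MySQL-compatible cluster would use the same hosts with a MySQL client and port 3306.

```python
from urllib.parse import quote

# Hypothetical endpoint hostnames; real ones are shown for each cluster
# in the AWS console.
CLUSTER_ENDPOINT = "mycluster.cluster-abcdefghijkl.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-abcdefghijkl.us-east-1.rds.amazonaws.com"
INSTANCE_ENDPOINT = "mycluster-instance-1.abcdefghijkl.us-east-1.rds.amazonaws.com"


def postgres_dsn(host, database, user, password, port=5432, sslmode="require"):
    """Build a PostgreSQL connection URI, percent-encoding the credentials."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}?sslmode={sslmode}"
    )


if __name__ == "__main__":
    # Read/write traffic goes to the cluster endpoint.
    print(postgres_dsn(CLUSTER_ENDPOINT, "appdb", "app_user", "p@ss"))
    # Read-only traffic goes to the reader endpoint.
    print(postgres_dsn(READER_ENDPOINT, "appdb", "app_user", "p@ss"))
    # A specific instance is addressed directly only for tasks like troubleshooting.
    print(postgres_dsn(INSTANCE_ENDPOINT, "appdb", "app_user", "p@ss"))
```

Because the application only stores the endpoint hostnames, the connection parameters stay the same when instances are added to or removed from the cluster.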
Q10. Can you tell us how customers are using Amazon Aurora?
Colin: Aurora’s unprecedented growth is driven by strong interest across various industries. In the last twelve months we’ve seen a strong uptake from Financial Services, Software and Internet, Entertainment and Games, and Retail verticals.
Customers move to Aurora MySQL or Aurora PostgreSQL primarily to consolidate their MySQL or PostgreSQL workloads. Additionally, we see a large number of migrations from commercial legacy database workloads, like Oracle and Microsoft SQL Server, to Aurora PostgreSQL. These “break free” customers are tired of paying exorbitant licensing fees for their legacy databases, and the lock-in associated with these databases. High growth customers who move to Aurora want the ability to scale easily and seamlessly – in and across regions, and want to easily integrate across AWS services.
You can access specific customer examples across various industries here.
Q11. What resources are available to get started?
Pravin: You can find a variety of resources in the Aurora resources page and documentation pages. You can learn more about Aurora, including details on the Aurora architecture, and how to get started. This deep dive on Aurora from re:Invent 2021 and this deep dive on Aurora Global Database are also good resources.
Also, Amazon Aurora specialists are available to answer questions and provide support. Contact Us and you’ll hear back from us in one business day to discuss how AWS can help your organization. For those looking to migrate, AWS offers Optimization and Licensing Assessment (OLA) to help you evaluate options to migrate to the cloud – sign up here. Other resources include the AWS Database Migration Service (AWS DMS), a self-serve tool for migrating the database code objects, including views, stored procedures, and functions, with minimal downtime.
Depending on your migration needs, we offer other programs and services, ranging from AWS Professional Services, which taps into the deep expertise of tenured professionals for migration assistance, to Database Migration Accelerator (DMA), where, for a fixed fee, a team of AWS professionals handles the conversion of both the database and application for you.
Q12. Anything else you wish to add?
Colin: From an app developer perspective, Aurora functionally behaves like a MySQL or PostgreSQL database. Under the covers Aurora simplifies many enterprise requirements, provides better durability, scale and performance, and has a serverless offering for dynamically changing workloads. What we have today with Aurora is the result of an ongoing, focused investment strategy to establish and retain Aurora’s position as the most feature rich and cost-effective relational database. We have unparalleled experience with operating the largest fleet of databases, and have used this experience to improve the quality of our services. It’s part of our commitment to innovation.
At AWS, we have an unwavering commitment to innovation. In the world of databases, we will continue to increase the breadth and depth of our portfolio. Our roadmap has been, and will continue to be, driven by customer requests. It’s how we stay grounded in solving real-world problems. We will continue to raise our value line, so that you can focus on your digital transformation initiatives, and leave the IT infrastructure to us. Customers choose us for the trust we have earned with a proven track record of delivering on customer requirements.
A good way to come up to speed with the latest from AWS and the art of the possible is to attend re:Invent. At this conference, we showcase product announcements and a breadth of sessions that cover our product roadmap and best practices. It will have the highest concentration of AWS experts in one place, and it is a great opportunity for you to deepen your own expertise.
………………………………………
Colin P. Mahony, General Manager, Aurora MySQL
Colin currently leads the Aurora MySQL business for AWS and is responsible for the daily operations, strategy, planning, positioning and development of the business. Amazon Aurora MySQL is a MySQL compatible relational database built for the cloud that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases. Amazon Aurora is up to five times faster than standard MySQL and it provides the security, availability, and reliability of commercial databases at 1/10th the cost. Amazon Aurora is fully managed by Amazon Relational Database Service (RDS), which automates time-consuming administration tasks like hardware provisioning, database setup, patching, and backups.
Colin joined AWS in 2021 and prior to that he was the General Manager of Vertica at Micro Focus via its acquisition of HPE Software in 2017. Colin ran Vertica as a stand-alone business and was responsible for all aspects of it as a CEO equivalent. As a venture capitalist, Colin invested in and helped launch Vertica. He then liked the data business so much he joined Vertica full time. Vertica put columnar technology on the map as mainstream, and in 2011, Colin was instrumental in driving the acquisition of Vertica by Hewlett-Packard. Colin joined HP as part of the acquisition, and took on the responsibility of VP and General Manager for HP Vertica, where he guided the business to remarkable annual growth and recognized industry leadership. Colin then ran HP’s Big Data Platform business, which combined Vertica with other assets like Hadoop and IDOL for unstructured search and analytics. During his final years there, Colin led Vertica through its pivot and reorientation towards modern cloud environments and advanced machine learning as well as subscription and SaaS. Colin brings a unique combination of technical leadership experience, sales and go-to-market expertise, market intelligence, customer relationships, and strategic partnering.
Prior to Vertica, Colin was a Vice President at Bessemer Venture Partners focused on investments primarily in enterprise software, telecommunications, and digital media. He established a great network and reputation for assisting in the creation and ongoing operations of companies through his knowledge of technology, markets and general management in both small startups and larger companies. Prior to Bessemer, Colin worked at Lazard Technology Partners in a similar venture capitalist investor capacity. Prior to his venture capital experience, Colin was a Senior Analyst at the Yankee Group serving as an industry analyst and consultant covering databases, BI, middleware, application servers and ERP systems. Colin helped build the ERP and Internet Computing Strategies practice at Yankee in the late nineties.
Colin earned an M.B.A. from Harvard Business School and a bachelor’s degree in Economics with a minor in Computer Science from Georgetown University. He has served on the public board of Datawatch and saw it through its acquisition by Altair in 2018. Colin lives just outside of Boston and is active in his community, having served on the boards of Big Brothers Big Sisters of Massachusetts Bay and Year Up Boston. Colin is also active with the JoeyFund for Cystic Fibrosis.
………………………………………
Pravin Mittal, General Manager in RDS Aurora
As the General Manager in RDS Aurora, Pravin currently leads the Aurora PostgreSQL business for AWS and is responsible for the daily operations, strategy, planning, positioning and development of the business. Amazon Aurora PostgreSQL is a PostgreSQL compatible relational database built for the cloud that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases. Amazon Aurora is up to five times faster than standard PostgreSQL and it provides the security, availability, and reliability of commercial databases at 1/10th the cost. Amazon Aurora is fully managed by Amazon Relational Database Service (RDS), which automates time-consuming administration tasks like hardware provisioning, database setup, patching, and backups. Prior to his role with Aurora, Pravin was General Manager for Amazon Timestream and Head of Engineering for Amazon DocumentDB. He has been at AWS for 5 years, and during this time he has built and launched both of these services. Pravin has over 20 years of experience in operating systems, computer architecture, and databases. Before joining AWS, Pravin spent more than 15 years as an engineering leader working on Microsoft SQL Server, Azure SQLDB, Hekaton (in-memory database) and HDInsight (big data platform).
Pravin has an M.B.A. and a Master’s in Computer Science and Engineering from the University of Washington, Seattle, and a Bachelor’s in Computer Science from the University of Wisconsin, Madison.
…………………………..
Sponsored by AWS.