
Is Amazon Aurora Serverless Actually Serverless?

Serverless promised us that we could focus on writing code without worrying about the underlying infrastructure. Services like AWS Lambda, AWS Fargate, and Amazon DynamoDB in on-demand mode have shown us that this isn't completely true: we still need to understand the underlying details.

But even a partial fulfillment of that promise brings us significant benefits. Yes, there are still servers. And yes, we still need to know the fine details of how they work. But these services are still considered serverless because they automatically scale to meet demand, require no infrastructure management (yes, you need awareness and knowledge, but no management), and charge based on actual usage.

For relational databases AWS gave us Amazon Aurora Serverless, in an attempt to bring the serverless paradigm to MySQL and PostgreSQL. It promises automatic scaling, pay-per-use pricing, and zero infrastructure management. And it partially delivers on those promises, to the point of being a very useful tool. But is it truly serverless?

In this article we'll dive into the architecture, scaling mechanics, and performance characteristics of Amazon Aurora Serverless v2 (because v1 is deprecated). We'll compare it to traditional Amazon Aurora and other serverless AWS services, analyze its usefulness, and evaluate whether we can consider it actually serverless.

Understanding Aurora Serverless

Before we start questioning its serverlessness, let's do a quick review of Aurora Serverless. Essentially, Aurora Serverless is an instance type of Amazon Aurora (I know, I was also surprised when I found out!) that automatically starts, stops, and scales capacity up or down based on its usage. It's compatible with both MySQL and PostgreSQL, exactly like Aurora.

The key features that set Aurora Serverless apart from traditional (i.e. Provisioned) Aurora are:

  • Automatic scaling: The instance adjusts its size in fine-grained increments, measured in Aurora Capacity Units (ACUs).

  • Per-second billing: You're only charged for the Aurora Capacity Units you provision (i.e. the size of the instance), with a minimum of 1 minute of usage. Fun fact: Provisioned Aurora also has per-second billing; the only difference is its 10-minute minimum. I only included this as a "difference" to highlight that fact.

The pricing model for Aurora Serverless is Aurora Capacity Units (ACUs) consumed per second, plus storage and I/O costs. This is typically presented in contrast to traditional Aurora, where you pay for the Amazon RDS instances running in your database cluster, regardless of utilization. In this sense Aurora Serverless is similar to Fargate: you don't actually pay per use; you still pay for provisioned capacity. But that capacity can be provisioned in much smaller increments and added much faster, so your unused capacity will almost surely be much lower. Not exactly pay per use, but a lot closer than with regular Aurora.

Aurora Serverless v1 vs v2

Aurora Serverless has gone through two major iterations: v1 and v2. Aurora Serverless v1 is now considered unofficially deprecated, and is being silently phased out. The documentation hasn't been updated yet (as of this writing, July 2024) to reflect this, but AWS Solutions Architects are actively recommending against using v1. Still, I figured a bit of history would help you understand some design decisions.

Aurora Serverless v1, released in 2018, was the first attempt at bringing serverless capabilities to Aurora. It introduced the concept of ACUs and automatic scaling but had several limitations:

  • Scaling operations could take 20-40 seconds, during which the database was unavailable for writes and might refuse connections.

  • It didn't support features like multi-AZ deployments, read replicas, or Global Database.

  • The minimum capacity was 1 ACU (2 GB RAM), which was overkill for very small workloads or dev environments.

Aurora Serverless v2, released in 2022, addressed many of these limitations:

  • Scaling is much faster and doesn't cause connection interruptions.

  • It supports multi-AZ deployments, read replicas, and Global Database.

  • The minimum capacity is 0.5 ACU (1 GB RAM), allowing for finer-grained scaling.

  • It can scale up to 128 ACUs (256 GB RAM) per instance, compared to v1's limit of 32 ACUs.

V2 brought a lot of improvements, but we lost one critical feature: auto-pause. This means v2 instances are always running, as opposed to pausing after periods of no traffic and going through a cold start when traffic resumes, as v1 did. Do you see now why I question how "serverless" it is? Anyway, let's dive deeper.

The Architecture Behind Aurora Serverless

Aurora Serverless v2 is architected to be "instantly scalable". It obviously uses shared infrastructure behind the scenes (like anything serverless, or anything in the cloud for that matter), but it provides the same degree of security and isolation as provisioned Aurora instances. The dynamic scaling mechanism has very little overhead, allowing it to respond quickly to changes in demand.

An important thing to point out is that Aurora Serverless doesn't get you out of the main limitation of Aurora: you have one writer instance and zero or more reader instances. Each instance can scale vertically on its own, but the "instant" scalability is vertical, not horizontal, and you can't have multiple writers (remember, though, that readers can act as failover replicas).

Aurora Serverless v2 Capacity

As I mentioned before, the unit of measure for Aurora Serverless v2 is the Aurora Capacity Unit (ACU). Each ACU consists of 2 gibibytes (GiB) of memory, "corresponding" CPU (AWS's words, not mine!), and networking. Aurora Serverless v2 capacity isn't tied to the DB instance classes used for provisioned clusters.

At any given moment, each Aurora Serverless v2 writer or reader instance has a capacity value measured in ACUs, between a minimum of 0.5 ACUs and a maximum of 128 ACUs. This value increases or decreases as the instance scales, in steps of 0.5 ACUs. For each Aurora Serverless v2 DB cluster, you define a capacity range by specifying the minimum and maximum capacity that each instance can have. Every instance has its own capacity value, but it always falls within that range.
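
To make this concrete, here's a minimal sketch of setting that capacity range with boto3. All identifiers, credentials, and the 0.5–16 ACU range are placeholders I made up for illustration, not values from a real cluster:

```python
import boto3

rds = boto3.client("rds")

# The capacity range is defined at the cluster level (placeholder identifiers and values).
rds.create_db_cluster(
    DBClusterIdentifier="my-serverless-cluster",
    Engine="aurora-mysql",
    EngineVersion="8.0.mysql_aurora.3.05.2",  # example version, check what's available
    MasterUsername="admin",
    MasterUserPassword="use-secrets-manager-instead",
    ServerlessV2ScalingConfiguration={
        "MinCapacity": 0.5,  # the lowest each instance can scale down to
        "MaxCapacity": 16,   # the highest each instance can scale up to
    },
)

# Instances in the cluster use the special "db.serverless" instance class.
rds.create_db_instance(
    DBInstanceIdentifier="my-serverless-writer",
    DBClusterIdentifier="my-serverless-cluster",
    DBInstanceClass="db.serverless",
    Engine="aurora-mysql",
)
```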

Aurora Serverless v2 Scaling

Aurora continuously tracks the utilization of CPU, memory and network for each Aurora Serverless v2 instance. These measurements, collectively called the load, include both database operations performed by your application and background processing for the database server and Aurora administrative tasks.

The minimum capacity increment is 0.5 ACUs. However, scaling will happen at larger increments if the current capacity is large. Scaling can happen while database connections are open, SQL transactions are in process, tables are locked, and temporary tables are in use. It doesn't disrupt any of those operations.

In a Multi-AZ DB cluster, you can choose whether readers scale at the same time as the writer instance or independently. This is determined by the promotion tier specified for each reader; we'll look at the details in the scaling section below.

Storage Architecture

The storage architecture for Aurora Serverless v2 is identical to that of provisioned Aurora clusters. Each Aurora DB cluster's storage consists of six copies of all data, spread across three Availability Zones. This built-in data replication applies regardless of whether your DB cluster includes any readers in addition to the writer.

The storage system is distributed, fault-tolerant, and self-healing, automatically scaling up to 128 TiB per DB cluster. Importantly, storage capacity and compute capacity are separate. When we refer to Aurora Serverless v2 capacity and scaling, we're always talking about compute capacity (which in this case is a blanket term that includes CPU, memory and network).

High Availability Architecture

Aurora Serverless v2 supports Multi-AZ deployments for high availability. You can add up to 15 Aurora Serverless v2 reader instances, spread across 3 AZs, to an Aurora DB cluster. This works exactly like provisioned Aurora clusters. In fact, you don't even save money on the failover instance 😢: reader instances in promotion tiers 0 and 1 scale together with the writer instance, to ensure they have the necessary capacity if the writer fails and one of them is promoted.

For business-critical applications that must remain available even in case of an issue affecting your entire cluster or AWS Region, you can set up an Aurora global database with Aurora Serverless v2 capacity in the secondary clusters. In this case your secondary cluster isn't forced to scale with the primary one: you can keep it at a lower capacity setting and let it scale automatically when a regional failover happens. This is known as a warm standby disaster recovery strategy.
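
If you're curious what that looks like in practice, here's a hedged sketch with boto3 for adding a Serverless v2 secondary cluster to an existing global database. The Region, identifiers and capacity range are placeholders, and the engine version has to match the primary cluster:

```python
import boto3

# Client in the secondary (disaster recovery) Region; everything here is a placeholder.
rds_secondary = boto3.client("rds", region_name="us-west-2")

# Attach a new Serverless v2 cluster to an existing global database.
# No master credentials: the secondary replicates from the primary cluster.
rds_secondary.create_db_cluster(
    DBClusterIdentifier="my-app-secondary",
    Engine="aurora-mysql",
    EngineVersion="8.0.mysql_aurora.3.05.2",  # must match the primary cluster
    GlobalClusterIdentifier="my-app-global",  # the existing global database
    ServerlessV2ScalingConfiguration={
        "MinCapacity": 0.5,  # keep the warm standby small...
        "MaxCapacity": 64,   # ...but let it scale up after a failover
    },
)

# Then add one or more db.serverless reader instances, exactly as shown earlier,
# so the secondary cluster has compute capacity ready to be promoted.
```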

Configuration Parameters

Aurora Serverless v2 allows you to adjust all the same cluster and database configuration parameters as provisioned DB clusters. However, some capacity-related parameters are handled differently:

  1. Some parameters are automatically adjusted during scaling. For example, the amount of memory reserved for the buffer cache increases as a writer or reader scales up and decreases as it scales down.

  2. Some parameters are kept at fixed values that depend on the maximum capacity setting. For instance, Aurora automatically sets the maximum number of connections to a value appropriate for the maximum capacity setting (you can check the resulting value yourself, as shown below).
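
If you want to see what value Aurora picked for your capacity range, you can simply ask the database. A minimal sketch, assuming an Aurora MySQL-compatible cluster, the PyMySQL driver, and placeholder connection details:

```python
import pymysql

# Placeholder endpoint and credentials; in real life, pull these from Secrets Manager.
connection = pymysql.connect(
    host="my-serverless-cluster.cluster-xxxxxxxx.us-east-1.rds.amazonaws.com",
    user="admin",
    password="use-secrets-manager-instead",
    database="mydb",
)

with connection.cursor() as cursor:
    # Aurora derives this value from the cluster's maximum capacity setting.
    cursor.execute("SELECT @@max_connections")
    (max_connections,) = cursor.fetchone()
    print(f"max_connections: {max_connections}")

connection.close()
```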

Monitoring

Monitoring Aurora Serverless v2 involves measuring the capacity values for the writer and readers in your DB cluster over time. Key metrics to watch include ServerlessDatabaseCapacity and ACUUtilization.

The charges for Aurora Serverless v2 capacity are measured in terms of ACU-hours. If the total number of writers and readers in your cluster is N, the cluster consumes approximately N x minimum ACUs when not running any database operations, and no more than N x maximum ACUs when the serverless database is running at full capacity.
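
Here's a minimal sketch of pulling those metrics with boto3, with a placeholder cluster identifier and an arbitrary 3-hour window:

```python
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")

def acu_datapoints(metric_name: str, cluster_id: str):
    """Fetch a capacity metric for the last 3 hours, in 5-minute buckets."""
    now = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName=metric_name,  # "ServerlessDatabaseCapacity" or "ACUUtilization"
        Dimensions=[{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        StartTime=now - timedelta(hours=3),
        EndTime=now,
        Period=300,
        Statistics=["Average", "Maximum"],
    )
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])

# Placeholder cluster identifier.
for point in acu_datapoints("ServerlessDatabaseCapacity", "my-serverless-cluster"):
    print(point["Timestamp"], point["Average"], point["Maximum"])
```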

Scaling in Aurora Serverless

As mentioned above, Aurora continuously tracks the load on each Aurora Serverless v2 writer or reader instance: the combined utilization of CPU, memory, and network from both your application's database operations and the database server's background and administrative work.

When Aurora detects that the current capacity is constrained by any of these resources, it initiates a scaling operation. Scaling can occur in increments as small as 0.5 ACUs. The scaling rate depends on the current capacity: the higher the current capacity, the larger the scaling increment.

Unlike Aurora Serverless v1, which scales by doubling capacity each time the DB cluster reaches a threshold (which is suspiciously similar to DynamoDB On Demand), Aurora Serverless v2 can increase capacity incrementally. When your workload demand begins to approach the current database capacity of a writer or reader, Aurora Serverless v2 increases the number of ACUs for that writer or reader in the increments required to provide the best performance for the resources consumed.

Scaling Speed and Behavior

Scaling can happen while database connections are open, SQL transactions are in process, tables are locked, and temporary tables are in use. Aurora Serverless v2 doesn't wait for a quiet point to begin scaling, and scaling doesn't disrupt any database operations that are underway.

A bit surprisingly, most scaling events keep the writer or reader on the same host, further minimizing potential disruptions. In the rare cases that an Aurora Serverless v2 writer or reader is moved from one host to another, Aurora Serverless v2 manages the connections automatically.

Reader Scaling Behavior

In a Multi-AZ DB cluster you have control over how readers scale in relation to the writer. This is determined by the promotion tier specified for each reader:

  • Readers in promotion tiers 0 and 1 scale at the same time as the writer. This keeps them at the necessary size to take over the workload from the writer in case of failover, making your cluster highly available.

  • Readers in promotion tiers 2–15 scale independently from the writer, remaining within the minimum and maximum ACU values specified for the cluster.
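
Since the promotion tier is just an instance attribute, you can change it on an existing reader. A quick sketch with boto3 (the instance identifier is a placeholder):

```python
import boto3

rds = boto3.client("rds")

# Move a reader to tier 2 so it scales based on its own load
# instead of following the writer (placeholder identifier).
rds.modify_db_instance(
    DBInstanceIdentifier="my-serverless-reader-2",
    PromotionTier=2,
    ApplyImmediately=True,
)
```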

Scaling Limitations

Aurora Serverless v2 offers significant improvements over v1, and of course it's much better at scaling than provisioned Aurora instances. However, Aurora Serverless v2 writers and readers don't scale all the way down to zero ACUs (Aurora Serverless v1 did). Idle instances scale down to the minimum ACU value specified for the cluster, which can be as low as 0.5 ACUs, but not 0.

On the other end, the maximum ACU value of 128 might not be sufficient for extremely large workloads, which Aurora Provisioned instances may be able to handle. Granted, at those sizes everything is even more "it depends" than usual, but it's still worth mentioning.

Additionally, and perhaps more importantly, scaling is fast, but it's not instantaneous (despite anything that the AWS documentation says). There can still be a brief delay between when additional capacity is needed and when it becomes available.

And while we're on touchy subjects... scaling events can introduce latency. That was very evident in v1, and has been significantly improved in v2, but you may still see a brief spike in response times during scaling.

Amazon Aurora Serverless Pricing

Each ACU-hour is priced at $0.12 ($0.16 for I/O-Optimized). For reference, a db.t4g.medium instance, roughly equivalent to 2 ACUs (which would cost $0.24/hour in Serverless), costs $0.073/hour ($0.095 for I/O-Optimized). This makes Aurora Serverless over 3x the price of equivalent provisioned Aurora capacity.

Aurora Serverless v2 charges based on ACU-seconds consumed, with a minimum billable duration of 1 minute (60 ACU-seconds). This is certainly more granular than paying for full RDS instances, but it's still paying for provisioned capacity, not actual usage. In that sense it looks a lot more similar to AWS Fargate than to, for example, Amazon DynamoDB on-demand.
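
To get a feel for where the break-even point sits, here's a back-of-the-envelope calculation using the compute prices quoted above (storage and I/O excluded, us-east-1, standard pricing):

```python
# Compute-only comparison, using the prices quoted above (us-east-1, standard pricing).
ACU_PER_HOUR = 0.12          # Aurora Serverless v2, per ACU-hour
T4G_MEDIUM_PER_HOUR = 0.073  # provisioned db.t4g.medium, roughly 2 ACUs of capacity
HOURS_PER_MONTH = 730

def monthly_serverless_cost(average_acus: float) -> float:
    """Monthly compute cost if the instance averages this many ACUs."""
    return average_acus * ACU_PER_HOUR * HOURS_PER_MONTH

print(f"db.t4g.medium:             ${T4G_MEDIUM_PER_HOUR * HOURS_PER_MONTH:.2f}/month")
for acus in (0.5, 1, 2):
    print(f"Serverless, avg {acus:>3} ACUs:  ${monthly_serverless_cost(acus):.2f}/month")

# Average ACUs below which Serverless is cheaper than the provisioned t4g.medium.
print(f"Break-even: {T4G_MEDIUM_PER_HOUR / ACU_PER_HOUR:.2f} ACUs on average")
```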

Additionally, there are some extra costs:

  1. Database storage costs (priced separately from capacity, at the same rates as Aurora Provisioned)

  2. Data transfer costs if you're accessing the database from outside its region

  3. Snapshot and backup storage costs (again, same as Aurora Provisioned)

Aurora Serverless v2 vs Aurora Provisioned

Here's a feature comparison table between Aurora Serverless v2 and provisioned Aurora:

| Feature                   | Aurora Serverless v2         | Provisioned Aurora           |
|---------------------------|------------------------------|------------------------------|
| Multi-AZ                  | Yes                          | Yes                          |
| Read Replicas             | Yes                          | Yes                          |
| Global Database           | Yes (with limitations)       | Yes                          |
| Performance Insights      | No                           | Yes                          |
| Auto Scaling              | Yes (built-in)               | Yes (requires configuration) |
| Instance Size Flexibility | 0.5 - 128 ACUs               | Fixed instance sizes         |
| Auto Pause                | No (v1 had this! v2 doesn't) | No                           |

Hybrid and Multi-Region Approaches with Aurora Serverless v2

Aurora Serverless v2 offers flexible deployment options that allow you to combine serverless instances with provisioned instances in the same cluster, as well as leverage Aurora Global Databases for Disaster Recovery.

Hybrid Clusters: Combining Provisioned and Aurora Serverless v2 Instances

Aurora allows you to create clusters that contain both provisioned and Aurora Serverless v2 DB instances. When adding instances to a cluster, you can choose between provisioned instances (with specific instance classes like db.r5 or db.r6g) and Aurora Serverless v2 instances ("Serverless" in the console or "db.serverless" in the CLI/API).

Hybrid clusters are useful for workloads where reads are very variable and writes are predictable. You can use a provisioned instance as the primary instance of the cluster, and Serverless read replicas to handle the variable read operations. If your write operations aren't that predictable, you can use SQS to throttle them.

Important thing to keep in mind: promotion tiers. Serverless readers in tiers 0 and 1 scale along with the writer instance, and readers in tiers 2-15 scale independently based on their own workload. So, realistically, any readers you place in tiers 0 and 1 will stay at the same size as your provisioned primary instance, and the readers in tiers 2-15 will scale independently (either up or down), as shown in the sketch below.
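
Here's a hedged sketch of adding a Serverless v2 reader to an existing provisioned cluster with boto3. Identifiers and the capacity range are placeholders, and I'm assuming the cluster's engine version already supports Serverless v2:

```python
import boto3

rds = boto3.client("rds")

# The existing provisioned cluster needs a capacity range before it can host
# Serverless v2 instances (placeholder identifier, example range).
rds.modify_db_cluster(
    DBClusterIdentifier="my-provisioned-cluster",
    ServerlessV2ScalingConfiguration={"MinCapacity": 0.5, "MaxCapacity": 32},
)

# Add a Serverless v2 reader in promotion tier 2, so it scales on its own load
# instead of mirroring the provisioned writer's size.
rds.create_db_instance(
    DBInstanceIdentifier="my-serverless-reader",
    DBClusterIdentifier="my-provisioned-cluster",
    DBInstanceClass="db.serverless",
    Engine="aurora-mysql",
    PromotionTier=2,
)
```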

Unfortunately, not all features available for provisioned instances are supported for Aurora Serverless v2 instances. For example, Aurora Serverless v2 doesn't support database activity streams.

Multi-Region Disaster Recovery with Aurora Global Database and Aurora Serverless

When you're using Aurora Global Database to implement a Disaster Recovery strategy, you can use Aurora Serverless instances in your secondary cluster. This way you maintain a primary cluster in one region always scaled to the size you need, and one or more read-only secondary clusters in other regions which will be auto scaled by Aurora Serverless.

The biggest benefit is that you're not constrained by the size of the instances in the primary cluster. You can keep your secondary cluster at a smaller size, and let it scale up when needed. Keep in mind that this will increase your time to recovery, since the cluster will need a bit of time to scale up to the necessary size before it can serve your full production traffic.

And please monitor replication lag between the primary and secondary clusters, especially if using Aurora Serverless v2 instances with low minimum capacity settings.
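
For example, here's a minimal sketch of a CloudWatch alarm on the AuroraGlobalDBReplicationLag metric with boto3. The threshold, identifiers and SNS topic are placeholders you'd adjust to your own recovery objectives:

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")  # the secondary Region

# Alert if the secondary cluster falls more than 1 second behind the primary
# for 5 consecutive minutes (identifiers, threshold and topic are placeholders).
cloudwatch.put_metric_alarm(
    AlarmName="aurora-global-replication-lag",
    Namespace="AWS/RDS",
    MetricName="AuroraGlobalDBReplicationLag",  # reported in milliseconds
    Dimensions=[{"Name": "DBClusterIdentifier", "Value": "my-app-secondary"}],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=5,
    Threshold=1000,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-west-2:123456789012:dr-alerts"],
)
```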

Conclusion

So, is Amazon Aurora Serverless truly serverless? The answer, as with many things in technology, is: it depends on your definition.

If we define "serverless" strictly as:

  1. No infrastructure management

  2. Automatic scaling

  3. Pay-per-use pricing

Then Aurora Serverless v2 mostly fits the bill, with some caveats:

  1. While you don't manage individual instances, you're still dealing with a database cluster that doesn't completely abstract away the underlying infrastructure.

  2. Scaling is automatic but not instantaneous, and there's still a concept of capacity units (ACUs).

  3. Pricing is more granular than traditional instances, but not as fine-grained as truly serverless offerings like DynamoDB on-demand.

If we compare it to AWS's other serverless offerings, Aurora Serverless primarily falls short in that it doesn't scale to zero like Lambda or Fargate.

However, it's important to remember that relational databases have different constraints and requirements compared to stateless compute or NoSQL databases. Given these constraints, Aurora Serverless v2 is pretty impressive, pretty useful, and pretty serverless.
