Cloud Architecture

Designing a Secure, Resilient and Auto-Scaling Database Tier for your API

April 4, 2023

I recently published a whitepaper and GitHub repository detailing a 3-tier AWS cloud architecture using only serverless services. In a series of blog posts I will go over each tier to describe the reasoning and decisions behind the design.

In a previous article we looked at a simplified version of the stack. In today's post we will take a closer look at the database tier. For those of you wanting to dive into the details straight away, a detailed diagram of the architecture can be found on GitHub here.

Requirements

When I wrote the requirements for the aforementioned architecture, and especially for the API part, I concluded I wanted it to be highly available, resilient, scalable and secure. I found a couple of great AWS patterns that allowed me to check all of those boxes. Let's have a look.

Serverless API

Lambda is probably one of the most widely used serverless AWS services. A Lambda function is well suited to handling the communication between your API endpoint and your database tier. It uses compute efficiently, because it only consumes CPU when actual API requests come in. It also integrates deeply with API Gateway, RDS databases and your VPC, which helps you keep traffic to the database inside your private network. If you currently have a server-based API built with something like Django or Node, I suggest you take a look at the Serverless Framework. It offers convenient tools to transform your "server-based" API into a "serverless" API that can run in Lambda. It also allows you to run your Lambda function locally as part of your development process.
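To make this a bit more concrete, here is a minimal sketch of what such a function could look like in Python. It assumes an API Gateway proxy integration, and the "name" query parameter and greeting are purely illustrative. They are not taken from the repository.

```python
import json


def handler(event, context):
    """Entry point that API Gateway (proxy integration) invokes per request."""
    # queryStringParameters is None when the request carries no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # hypothetical parameter

    # A real handler would talk to the database tier at this point.
    body = {"message": f"Hello, {name}!"}

    # Proxy integrations expect a dict with statusCode, headers and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }
```

Because the handler is a plain function, you can call it locally with a hand-written event dict before deploying it.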

Database Connection Pool

As mentioned, Lambda will take care of compute scaling. It does so by automatically spinning up additional instances of your function in the background as needed, as shown in the diagram below. The challenge is that each of these Lambda instances needs its own connection to the database. While there are best practices for limiting the number of connections, your database might still run out of available connections, or your Lambda function might time out while waiting for one, resulting in 5XX errors.

How Lambda Scales
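One of the best practices mentioned above is to open the connection outside the handler, so that warm invocations of the same Lambda instance reuse it. Below is a rough sketch of that pattern. To keep it runnable anywhere I'm using Python's built-in sqlite3 as a stand-in for a real MySQL or PostgreSQL driver, and the function names are my own.

```python
import sqlite3

# Module-level state survives between invocations while the instance stays warm.
_connection = None


def get_connection():
    """Create the connection on first use, then hand back the same one."""
    global _connection
    if _connection is None:
        # Stand-in only: a real function would connect to the database tier here.
        _connection = sqlite3.connect(":memory:")
    return _connection


def lambda_handler(event, context):
    conn = get_connection()
    row = conn.execute("SELECT 1").fetchone()
    return {"statusCode": 200, "body": str(row[0])}
```

Note that this only caps connections at one per Lambda instance. A burst that spins up hundreds of instances still opens hundreds of connections, which is exactly the problem described above.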

RDS Proxy to the Rescue

To address this problem, AWS introduced a service called RDS Proxy. The proxy sits between your Lambda function and your RDS instance and manages the connection pool. As AWS puts it: "Using RDS Proxy, you can handle unpredictable surges in database traffic. Otherwise, these surges might cause issues due to oversubscribing connections or creating new connections at a fast rate. RDS Proxy establishes a database connection pool and reuses connections in this pool. This approach avoids the memory and CPU overhead of opening a new database connection each time."

Database Authentication

An additional perk is that RDS Proxy can also take care of the database authentication for you. Using IAM permissions, you authorize Lambda to connect to RDS Proxy, which in turn reads the database credentials from a Secrets Manager secret. This means your function never has to handle the database password itself. It also reduces the overhead of processing credentials and establishing a secure connection for each new connection, because RDS Proxy handles some of that work on behalf of the database.

RDS Proxy (image by AWS)
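In code, the IAM-based authentication described above could look roughly like this. The sketch uses the boto3 RDS client's generate_db_auth_token call to obtain a short-lived token that is passed where a password would normally go. The helper names, the settings dict and its "ssl_required" key are my own assumptions, since the exact options depend on the database driver you use.

```python
def build_connection_settings(rds_client, proxy_endpoint, port, db_user, db_name):
    """Return driver-agnostic settings that use an IAM token as the password."""
    # The token is short-lived and replaces a stored database password.
    token = rds_client.generate_db_auth_token(
        DBHostname=proxy_endpoint, Port=port, DBUsername=db_user
    )
    return {
        "host": proxy_endpoint,
        "port": port,
        "user": db_user,
        "password": token,
        "database": db_name,
        # Hypothetical key: IAM authentication through the proxy needs TLS,
        # but each driver spells this option differently.
        "ssl_required": True,
    }


def make_rds_client(region):
    """Create the real client; imported lazily so boto3 is only needed here."""
    import boto3

    return boto3.client("rds", region_name=region)
```

Passing the client in as an argument is a deliberate choice: it keeps the function easy to test with a fake client, and lets you create the boto3 client once per Lambda instance.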

Scalable, Serverless Database

In this architecture I'm using an Aurora cluster with Auto Scaling. Aurora Auto Scaling has been designed to meet both connectivity and workload needs by adjusting the number of Aurora Replicas allocated to an Aurora DB cluster. This is available for both Aurora MySQL and Aurora PostgreSQL. Not only can it cope with an influx of traffic or workload, it can also remove unnecessary Replicas when they are no longer needed, thereby reducing the costs associated with unused DB instances. Since Aurora also comes in a serverless version, this was an obvious choice for me. While it doesn't fully scale down to zero when there's no traffic, it's still far more affordable than its non-serverless variant or a traditional RDS instance.
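For those who prefer to see it spelled out, replica auto scaling is configured through Application Auto Scaling. The sketch below builds the two requests involved: registering the cluster's replica count as a scalable target, and attaching a target tracking policy on average reader CPU. The cluster name, capacity limits, CPU target and policy name are illustrative assumptions, not values from the repository.

```python
def replica_scaling_requests(cluster_id, min_replicas, max_replicas, target_cpu):
    """Build the parameters for the two Application Auto Scaling calls."""
    resource_id = f"cluster:{cluster_id}"
    dimension = "rds:cluster:ReadReplicaCount"

    target = {
        "ServiceNamespace": "rds",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_replicas,
        "MaxCapacity": max_replicas,
    }
    policy = {
        "PolicyName": f"{cluster_id}-reader-cpu",  # hypothetical name
        "ServiceNamespace": "rds",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_cpu),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "RDSReaderAverageCPUUtilization"
            },
        },
    }
    return target, policy


def apply_scaling(autoscaling_client, target, policy):
    """Send both requests, e.g. with boto3.client("application-autoscaling")."""
    autoscaling_client.register_scalable_target(**target)
    autoscaling_client.put_scaling_policy(**policy)
```

A call such as replica_scaling_requests("api-cluster", 1, 4, 60) would keep between one and four Replicas and aim for roughly 60% average reader CPU.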

Putting it all together

The following diagram shows how this all works together. The Lambda function is assigned a role that can access RDS Proxy. RDS Proxy is assigned a role that can access Secrets Manager. Secrets Manager stores the database credentials. RDS Proxy connects to the database and manages the connections coming from the Lambda functions. Aurora scales as needed when the workload increases. In this example we also rotate our secrets using an additional Lambda function. Definitely not a must-have, but a best practice.

Our Serverless database tier
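The two roles in this chain can be sketched as IAM policy documents. The functions below build them as plain Python dicts: one policy that lets the Lambda role connect through the proxy as a given database user, one that lets the proxy role read the secret, and the trust policy that allows RDS to assume the proxy role. The region, account ID, proxy resource ID, user name and secret ARN in the example are placeholders of my own.

```python
import json


def lambda_connect_policy(region, account_id, proxy_resource_id, db_user):
    """Policy for the Lambda role: connect via the proxy as db_user."""
    resource = (
        f"arn:aws:rds-db:{region}:{account_id}:"
        f"dbuser:{proxy_resource_id}/{db_user}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "rds-db:connect", "Resource": resource}
        ],
    }


def proxy_secret_policy(secret_arn):
    """Policy for the proxy role: read the database credentials."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                "Resource": secret_arn,
            }
        ],
    }


def proxy_trust_policy():
    """Trust policy that lets the RDS service assume the proxy role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "rds.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    policy = lambda_connect_policy(
        "us-east-1", "123456789012", "prx-0123456789abcdef0", "api_user"
    )
    print(json.dumps(policy, indent=2))
```

If the secret is encrypted with a customer managed KMS key, the proxy role would also need permission to decrypt with that key, which I've left out here to keep the sketch short.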

Continued reading

If you would like to learn more about this stack, please have a look at the whitepaper and GitHub repository.

About the author

I'm an AWS-certified cloud architect from New York who loves writing about DevSecOps, Infrastructure as Code and Serverless. Having run a tech company myself for years, I love helping other start-ups scale using the latest cloud services.
