
The Problem
You’ve built a nice little API using AWS API Gateway, serving as a webhook endpoint for an external Order Management System. It takes orders from your web app, saves them to a database, and calls a few downstream services. It runs through this stack: API Gateway → Lambda → Database.
It works perfectly on a normal day.
Then one day…
- Your marketing team runs a big sale.
- Or a partner’s integration script gets stuck in a loop.
- Or a new feature upstream pushes thousands of requests by mistake.
Suddenly you’re facing an event storm - a huge, unexpected spike in requests hitting your API all at once.
What’s an Event Storm?
An event storm is when the number of incoming requests in a short time is way higher than your system normally handles. It’s not a slow increase. It’s a flood.
Example triggers:
- Black Friday or Diwali sale on your e-commerce site.
- A cron job gone rogue sending thousands of API calls per second.
- A botnet hammering your public endpoint.
- A sudden surge in user signups after media coverage.
Quick note on Lambda scaling: Lambda doesn’t go from 0 → 100K concurrent executions instantly. Each region has a burst concurrency limit (for example ~3000 in us-east-1) and then scales up by about 500–1000 per minute.
So when I say “100K requests hit,” I mean the inbound spike hitting API Gateway, not the actual concurrency Lambda reached. The scaling ramp-up actually makes the need for a buffer more obvious: SQS can absorb the full burst immediately, while Lambda drains it at the pace it can realistically scale to.
Why It’s a Problem in AWS Lambda
AWS Lambda scales fast, which sounds great until you realise:
- Every Lambda execution counts toward your account concurrency limit.
- If you hit that limit, other Lambdas in your account start getting throttled too, since they share the same regional concurrency pool.
- Downstream systems (like a database) may crash from too many parallel requests.
With API Gateway directly invoking Lambda, an event storm can burn through concurrency in seconds.
The Safer Approach
We’ll redesign the flow so that:
- API Gateway sends requests to SQS instead of Lambda directly.
- If SQS accepts it, API Gateway returns 202 Accepted with an ack id (the SQS MessageId).
HTTP/1.1 202 Accepted

{"status":"accepted","messageId":"c1c7...","receivedAt":1735296623123}
- Processing happens later via the consumer Lambda.
- Lambda reads from SQS in controlled batches and upserts into the database.
- Concurrency is capped so we never overload the database.
- Failed messages go to a DLQ for later reprocessing.
This way, the storm is buffered in SQS, and you can process at a steady pace without losing data.
Architecture Overview
- API Gateway -> receives incoming HTTP requests.
- SQS (Standard Queue) -> buffers all requests.
- Lambda Consumer -> pulls messages from SQS in small batches.
- DLQ -> stores failed messages for later replay.
Don’t forget about caching at the API layer. If your API is serving repetitive data (like catalog lookups, product info, or configuration files), you can enable API Gateway caching. This stores responses in an in-memory cache so repeated requests don’t even reach Lambda or SQS. In my case the requests were mostly unique and user-specific, so caching didn’t help much. But for many APIs, caching at API GW can drastically reduce Lambda concurrency and costs.
Step 0: Recommended Starting Settings
You can tweak these later, but start here:
- Lambda timeout: 30s
- Lambda reserved concurrency (the max number of concurrent executions this function can ever use): 50
- SQS batch size: 10
- Max batching window (how long Lambda waits to gather more records before invoking the function): 2s
- SQS visibility timeout (the time a message stays hidden after being picked up, before it can be retried): 90s
- DLQ maxReceiveCount (the number of times a message can fail before being sent to the DLQ): 5
Why these help:
- Max parallel work = 50 (Lambda reserved concurrency) × 10 (SQS batch size) = 500 messages at once.
- SQS buffers the rest during a spike.
- Visibility timeout prevents duplicate processing.
- DLQ ensures nothing is lost.
Step 1: Create SQS and DLQ
Create DLQ
aws sqs create-queue --queue-name orders-dlq
Create main queue with DLQ attached
DLQ_URL=$(aws sqs get-queue-url --queue-name orders-dlq --query 'QueueUrl' --output text)
DLQ_ARN=$(aws sqs get-queue-attributes --queue-url $DLQ_URL --attribute-names QueueArn --query 'Attributes.QueueArn' --output text)

aws sqs create-queue \
  --queue-name orders \
  --attributes '{
    "RedrivePolicy":"{\"deadLetterTargetArn\":\"'"$DLQ_ARN"'\",\"maxReceiveCount\":\"5\"}",
    "VisibilityTimeout":"90"
  }'
Step 2: Connect API Gateway to SQS
Option A: Direct integration
- No Lambda in between.
- API Gateway sends request body directly to SQS.
IAM policy for the API Gateway integration role
{ "Version": "2012-10-17", "Statement": [{ "Effect": "Allow", "Action": ["sqs:SendMessage"], "Resource": "arn:aws:sqs:<REGION>:<ACCOUNT_ID>:orders" }]}
Option B: Producer Lambda
If you need validation/auth before pushing to SQS:
// producer/index.mjs
import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";

const sqs = new SQSClient({});
const QUEUE_URL = process.env.QUEUE_URL;

export const handler = async (event) => {
  const body = JSON.parse(event.body || "{}");
  if (!body.orderId) {
    return { statusCode: 400, body: JSON.stringify({ error: "orderId required" }) };
  }
  await sqs.send(new SendMessageCommand({
    QueueUrl: QUEUE_URL,
    MessageBody: JSON.stringify(body)
  }));
  return { statusCode: 202, body: JSON.stringify({ status: "accepted" }) };
};
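Either way, once the route is deployed, a client call looks roughly like this (the URL and payload are made up for illustration):

curl -i -X POST "https://<API_ID>.execute-api.<REGION>.amazonaws.com/prod/orders" \
  -H "Content-Type: application/json" \
  -d '{"orderId":"ord-123","quantity":2}'

# Expected: HTTP/1.1 202 Accepted with a small JSON ack body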
API Gateway caching reminder: If traffic includes repeated requests, enable API Gateway caching. It cuts load before it even reaches SQS or Lambda.
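If caching does fit your traffic, here is a minimal sketch of enabling it on a REST API stage (the stage name and the 0.5 GB cache size are assumptions; per-method TTLs can be tuned in the stage's method settings):

aws apigateway update-stage \
  --rest-api-id <REST_API_ID> \
  --stage-name prod \
  --patch-operations \
    op=replace,path=/cacheClusterEnabled,value=true \
    op=replace,path=/cacheClusterSize,value=0.5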
Step 3: Create the Consumer Lambda
// consumer/index.mjs
export const handler = async (event) => {
  for (const record of event.Records) {
    try {
      const msg = JSON.parse(record.body);
      await processOrder(msg);
    } catch (err) {
      console.error("Failed for messageId", record.messageId, err);
      throw err; // forces SQS to retry or send to DLQ
    }
  }
};

async function processOrder(msg) {
  await new Promise(r => setTimeout(r, 150)); // simulate work
}
Event Source Mapping
- Batch size: 10
- Max batching window: 2s
- Reserved concurrency: 50 (prevents overload) - see the CLI sketch below.
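A sketch of applying these settings with the CLI, assuming the consumer function is called orders-consumer (adjust names and ARNs):

# Cap how many instances of the consumer can run at once
aws lambda put-function-concurrency \
  --function-name orders-consumer \
  --reserved-concurrent-executions 50

# Poll the queue in batches of 10, waiting up to 2s to fill a batch
aws lambda create-event-source-mapping \
  --function-name orders-consumer \
  --event-source-arn arn:aws:sqs:<REGION>:<ACCOUNT_ID>:orders \
  --batch-size 10 \
  --maximum-batching-window-in-seconds 2

Optionally add --function-response-types ReportBatchItemFailures so only the failed messages in a batch are retried instead of the whole batch; that requires the handler to return a batchItemFailures list rather than throwing.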
Step 3.1: Don’t kill your database connections
One hidden problem when Lambda talks to RDS: each Lambda instance may open a new DB connection. During a spike, you can exhaust DB connections fast.
The fix is to use RDS Proxy. It sits between Lambda and the database, pools connections, and reuses them across Lambda invocations. This keeps your database safe from connection storms.
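A rough sketch of creating one with the CLI - the proxy name, secret ARN, IAM role, and subnets below are placeholders, and the proxy expects the database credentials to live in Secrets Manager:

aws rds create-db-proxy \
  --db-proxy-name orders-proxy \
  --engine-family POSTGRESQL \
  --auth AuthScheme=SECRETS,SecretArn=<DB_SECRET_ARN>,IAMAuth=DISABLED \
  --role-arn <PROXY_ROLE_ARN> \
  --vpc-subnet-ids <SUBNET_ID_1> <SUBNET_ID_2> \
  --require-tls

aws rds register-db-proxy-targets \
  --db-proxy-name orders-proxy \
  --db-instance-identifiers <DB_INSTANCE_ID>

Then point the consumer Lambda's DB host environment variable at the proxy endpoint instead of the instance endpoint.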
Alternatives:
- For PostgreSQL/MySQL, some teams also run PgBouncer on ECS/Fargate.
- Or, for workloads with unpredictable spikes, consider DynamoDB which doesn’t have connection limits.
Step 4: Tune for Storms
- Visibility timeout ≥ 3 × Lambda timeout.
- Do throughput math:
parallelism = reserved_concurrency × batch_size
records/sec ≈ parallelism / avg_processing_seconds
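Worked example with the Step 0 defaults: parallelism = 50 × 10 = 500. If each record takes ~0.15 s and records within a batch are processed in parallel, that is roughly 500 / 0.15 ≈ 3,300 records/sec. The consumer sketched above processes its batch sequentially, so its effective rate is closer to 50 / 0.15 ≈ 330 records/sec - either way, SQS simply holds whatever exceeds that rate until capacity frees up.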
Step 5: Alerts
Queue depth alarm
aws cloudwatch put-metric-alarm \
  --alarm-name "orders-queue-depth-high" \
  --metric-name ApproximateNumberOfMessagesVisible \
  --namespace AWS/SQS \
  --dimensions Name=QueueName,Value=orders \
  --statistic Average \
  --period 60 \
  --threshold 50000 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 5 \
  --alarm-actions <SNS_TOPIC_ARN>
Oldest message age alarm -> warns you when messages have been sitting in the queue for too long (the metric is measured in seconds).
# threshold 600 -> warn if the oldest message is more than 10 minutes old
# evaluation-periods 5 with period 60 -> breached for 5 minutes straight
aws cloudwatch put-metric-alarm \
  --alarm-name "orders-oldest-message-too-old" \
  --namespace "AWS/SQS" \
  --metric-name "ApproximateAgeOfOldestMessage" \
  --dimensions "Name=QueueName,Value=orders" \
  --statistic "Maximum" \
  --period 60 \
  --threshold 600 \
  --comparison-operator "GreaterThanThreshold" \
  --evaluation-periods 5 \
  --treat-missing-data "notBreaching" \
  --unit "Seconds" \
  --alarm-actions "<SNS_TOPIC_ARN>"
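It's also worth alarming on the DLQ itself - anything landing there means a message failed repeatedly. A sketch in the same style (the threshold of 1 is an assumption):

aws cloudwatch put-metric-alarm \
  --alarm-name "orders-dlq-not-empty" \
  --namespace AWS/SQS \
  --metric-name ApproximateNumberOfMessagesVisible \
  --dimensions Name=QueueName,Value=orders-dlq \
  --statistic Maximum \
  --period 300 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --evaluation-periods 1 \
  --treat-missing-data notBreaching \
  --alarm-actions <SNS_TOPIC_ARN>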
Step 6: DLQ Replay
Fix the bug → replay messages from DLQ to main queue with a small Lambda.
import { SQSClient, ReceiveMessageCommand, DeleteMessageCommand, SendMessageCommand } from "@aws-sdk/client-sqs";

const sqs = new SQSClient({});
const DLQ_URL = process.env.DLQ_URL;
const MAIN_URL = process.env.MAIN_URL;

export const handler = async () => {
  // Pull up to 10 messages per run; invoke again until the DLQ is empty.
  const res = await sqs.send(new ReceiveMessageCommand({
    QueueUrl: DLQ_URL,
    MaxNumberOfMessages: 10,
    WaitTimeSeconds: 1
  }));
  if (!res.Messages) return;
  for (const m of res.Messages) {
    await sqs.send(new SendMessageCommand({ QueueUrl: MAIN_URL, MessageBody: m.Body }));
    await sqs.send(new DeleteMessageCommand({ QueueUrl: DLQ_URL, ReceiptHandle: m.ReceiptHandle }));
  }
};
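One way to drive it, assuming the replay function is named dlq-replay and the DLQ_URL variable from Step 1 is still set - each run drains at most 10 messages, so invoke it repeatedly (or put it on an EventBridge schedule) until the DLQ is empty:

# How many messages are waiting in the DLQ?
aws sqs get-queue-attributes \
  --queue-url "$DLQ_URL" \
  --attribute-names ApproximateNumberOfMessagesVisible

# Run one replay pass
aws lambda invoke --function-name dlq-replay /dev/null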
- For the oldest-message alarm, pick the threshold (600 in the command above) well below your SQS message retention (default 4 days) but high enough to avoid noise. Start with 600-1800 seconds for most APIs, or align it with your SLA.
- Use Maximum (not Average) - you care about the single oldest message.
- Pair this with the queue depth alarm you already have.
Important Pointers
- Don’t let Lambda’s scaling run wild in a spike.
- Do buffer in SQS and process at a pace your system can handle.
- Always set a DLQ - losing data is worse than slow processing.
- Monitor queue depth and message age - that’s your early warning.
- Use RDS Proxy when Lambdas talk to relational databases.
- Add API Gateway caching if traffic is repetitive.
One last note:
This whole design works best for asynchronous APIs - places where you can acknowledge a request and process it later.
But what if your API is synchronous and the client needs a response right away? In that case, queues won’t help. You need other tools like rate limiting, provisioned concurrency, and sometimes containers.
I wrote a follow-up on handling synchronous API traffic spikes here: https://medium.com/aws-in-plain-english/surviving-traffic-surges-in-sync-apis-rate-limits-warm-lambdas-and-smart-scaling-d04488ad94db?sk=6a2f4645f254fd28119b2f5ab263269d
Together, these two posts cover both sides of the coin:
- Async APIs → buffer with SQS.
- Sync APIs → throttle, pre-warm, or containerize.
☕️ Found This Helpful?
If this guide helped you tame an event storm or saved you from a wild goose chase, you can buy me a coffee.