Why is my AWS bill so high?

If you've ever opened the AWS console at the end of the month and thought "we don't even do that much, where is this number coming from?", you're in the right place. AWS is the easiest cloud to spend money on and the hardest to get a clear spending breakdown from. This guide walks through the five line items that dominate a typical startup invoice.

Estimate your waste with the Cloud Waste Radar →


TL;DR — the usual suspects

For a typical Series-A SaaS company spending $10k–$50k/month on AWS, the breakdown is almost always:

Line item | Typical share | Where the surprise lives
EC2 / EKS compute | 40–55% | Idle/oversized instances, low-utilisation nodes, extra clusters
Data transfer | 10–20% | Cross-AZ, NAT Gateway, egress to the internet
EBS storage | 5–10% | Unattached volumes, old snapshots, gp2 that should be gp3
RDS | 5–15% | Right-sized once a year, never again
Other (S3, CloudWatch, etc.) | 10–20% | CloudWatch Logs retention, S3 versioning, NoOps spend

Each of these has a calculator at tools.getfinops.cloud and a guide here.


1. Compute (EC2 + EKS) — the biggest line, the easiest to ignore

Why it grows: instances are easy to launch and nothing in AWS pushes you to delete them. Auto Scaling Groups never scale below their configured minimum, and that minimum is rarely revisited. Spot is "scary". EKS clusters are spun up "per environment" and then forgotten.

Symptoms:

  • CloudWatch shows your average CPU utilisation across the fleet under 30%.
  • Multiple EKS clusters (dev/staging/prod is fine — but 6+ is a smell).
  • Engineers can't tell you what's running on a given instance type.

Quick wins:

  1. Stop / right-size idle instances. AWS Compute Optimizer flags them; cross-reference with your own monitoring.
  2. Consolidate EKS control planes. Each one costs $0.10/hour (about $73/month) before any worker nodes. Three is reasonable, eight is not. Use namespaces + RBAC instead.
  3. Mix Spot + Savings Plans into the steady-state. Stateless workloads on Spot; the static minimum on Compute Savings Plans (1-year No Upfront is the lowest-friction starting point).
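
The control-plane arithmetic behind quick win 2 can be sketched in a few lines. This is a rough estimate, not a billing tool: it uses the $0.10/hour rate quoted above, assumes 730 hours in a month, and ignores worker nodes, load balancers, and everything else a cluster drags along. The function names are illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month
EKS_CONTROL_PLANE_USD_PER_HOUR = 0.10  # rate quoted in this guide


def control_plane_monthly_cost(cluster_count: int) -> float:
    """Monthly control-plane fee (USD) for a given number of EKS clusters."""
    if cluster_count < 0:
        raise ValueError("cluster_count must be non-negative")
    return round(cluster_count * EKS_CONTROL_PLANE_USD_PER_HOUR * HOURS_PER_MONTH, 2)


def consolidation_savings(current_clusters: int, target_clusters: int) -> float:
    """Monthly control-plane savings (USD) from consolidating clusters."""
    return round(
        control_plane_monthly_cost(current_clusters)
        - control_plane_monthly_cost(target_clusters),
        2,
    )


if __name__ == "__main__":
    print("8 clusters:", control_plane_monthly_cost(8), "USD/month")
    print("Saved by going 8 -> 3:", consolidation_savings(8, 3), "USD/month")
```

The control-plane fee is usually the smaller part of what an extra cluster costs; the duplicated node pools and add-ons that come with it tend to matter more.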

Run the EKS calculator →

Read “Why is EKS so expensive?” →


2. Data transfer — invisible, expensive, fixable

Why it grows: AWS charges for almost every byte that moves out of one of its boundaries. The two big ones:

  • Cross-AZ traffic: $0.01/GB each way. Two pods chatting across AZs cost you twice — once out of the source AZ, once into the destination.
  • NAT Gateway: $0.045/GB processed on top of the $0.045/hour uptime. A single NAT GW handling 5 TB/month costs ~$258/mo: about $225 of byte tax plus about $33 of uptime.
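
The two rates above are easy to turn into a back-of-the-envelope estimate. The sketch below uses the per-GB and per-hour figures quoted in this guide, and assumes 730 hours per month and 1 TB = 1,000 GB. The function names are illustrative, and real bills vary by region.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month
NAT_USD_PER_HOUR = 0.045  # uptime rate quoted in this guide
NAT_USD_PER_GB = 0.045  # data-processing rate quoted in this guide
CROSS_AZ_USD_PER_GB_EACH_WAY = 0.01  # cross-AZ rate quoted in this guide


def nat_gateway_monthly_cost(gb_processed: float, gateways: int = 1) -> dict:
    """Split a NAT Gateway estimate into processing ("byte tax") and uptime."""
    processing = gb_processed * NAT_USD_PER_GB
    uptime = gateways * NAT_USD_PER_HOUR * HOURS_PER_MONTH
    return {
        "processing": round(processing, 2),
        "uptime": round(uptime, 2),
        "total": round(processing + uptime, 2),
    }


def cross_az_monthly_cost(gb_transferred: float) -> float:
    """Cross-AZ traffic is billed on both sides: out of one AZ, into the other."""
    return round(gb_transferred * CROSS_AZ_USD_PER_GB_EACH_WAY * 2, 2)


if __name__ == "__main__":
    print(nat_gateway_monthly_cost(5000))  # 5 TB through one gateway
    print(cross_az_monthly_cost(1000))     # 1 TB between two AZs
```

Splitting the estimate this way shows which lever matters: at 5 TB/month the processing charge dwarfs the uptime charge, so removing traffic from the NAT path helps far more than removing a gateway.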

Symptoms:

  • "Data Transfer" line on the bill is the second-largest item.
  • Multiple NAT gateways processing >1 TB/month.
  • Lots of cross-AZ Kafka / RDS-replica / Redis traffic.

Quick wins:

  1. Add S3 + DynamoDB Gateway endpoints. They're free. Any traffic from a private subnet to S3 or DynamoDB stops paying NAT toll immediately. This single Terraform PR is often the largest dollar fix you'll make this year.
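  2. Use interface (PrivateLink) endpoints for chatty AWS services. ECR pulls, CloudWatch Logs, Secrets Manager, KMS: these endpoints aren't free (about $0.01/hour per endpoint per AZ, plus a per-GB processing charge), but that per-GB charge is well below the NAT rate, so they pay for themselves at any meaningful volume.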
  3. Pin chatty services to a single AZ. Kafka brokers + their consumers, RDS replicas + their readers — same AZ is the right answer when you don't need cross-AZ HA on a hot path.
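
To see where "meaningful volume" starts for quick win 2, here is a minimal break-even sketch. It uses the NAT rate and the $0.01/hour endpoint rate from this guide. The assumptions are that the hourly charge applies per AZ, that the endpoint also bills $0.01/GB for data processing, and that a month has 730 hours, so check current pricing for your region before relying on the result.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month
NAT_USD_PER_GB = 0.045  # NAT processing rate quoted in this guide
ENDPOINT_USD_PER_HOUR_PER_AZ = 0.01  # hourly rate from this guide; per-AZ is an assumption
ENDPOINT_USD_PER_GB = 0.01  # assumption: interface endpoint data-processing rate


def endpoint_fixed_monthly_cost(az_count: int = 1) -> float:
    """Hourly charge for one interface endpoint deployed in az_count AZs."""
    return ENDPOINT_USD_PER_HOUR_PER_AZ * HOURS_PER_MONTH * az_count


def break_even_gb(az_count: int = 1) -> float:
    """Monthly GB at which the endpoint becomes cheaper than the NAT path."""
    saved_per_gb = NAT_USD_PER_GB - ENDPOINT_USD_PER_GB
    return endpoint_fixed_monthly_cost(az_count) / saved_per_gb


def monthly_saving(gb_per_month: float, az_count: int = 1) -> float:
    """Net monthly saving (negative means the endpoint costs more than NAT)."""
    saved_per_gb = NAT_USD_PER_GB - ENDPOINT_USD_PER_GB
    return round(gb_per_month * saved_per_gb - endpoint_fixed_monthly_cost(az_count), 2)


if __name__ == "__main__":
    for azs in (1, 3):
        print(f"{azs} AZ(s): break-even at about {break_even_gb(azs):.0f} GB/month")
    print("Saving at 1 TB/month across 3 AZs:", monthly_saving(1000, 3))
```

Under these assumptions a service pushing a few hundred GB a month through NAT is already a candidate, while a low-traffic service is cheaper left where it is.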

Run the NAT Gateway calculator →

Read “Reduce AWS NAT Gateway cost” →


3. EBS — the silent killer

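Why it grows: every EC2 instance you launch comes with at least one EBS volume. When you terminate the instance, the root volume is usually deleted with it, but additional data volumes are kept by default (DeleteOnTermination is off) and sit orphaned in the "available" state, still billing. Snapshots accumulate quietly behind your back.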

Symptoms:

  • A surprisingly long list of "available" volumes in the EC2 console.
  • Hundreds of snapshots older than 90 days you don't recognise.
  • Still on gp2 for most volumes.

Quick wins:

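  1. Migrate gp2 → gp3. A roughly 20% cut in the per-GB price, with a baseline of 3,000 IOPS and 125 MB/s included regardless of volume size, and it's online, with no downtime. The ModifyVolume call runs in the background while the workload keeps using the volume. Check large gp2 volumes first: above 1 TB their baseline exceeds 3,000 IOPS, so you may need to provision extra gp3 IOPS or throughput to match.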
  2. Delete unattached EBS volumes. Snapshot them first if you're nervous; the snapshot costs ~half the per-GB rate.
  3. Lifecycle old snapshots. AWS Data Lifecycle Manager + the Recycle Bin together replace the "I'll go through these one day" backlog.
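
The first two quick wins can be sized from a volume inventory. The sketch below works on plain dicts shaped like the entries returned by the EC2 DescribeVolumes API (VolumeId, Size in GiB, State, VolumeType), so it runs without AWS credentials. The prices are assumed us-east-1 list prices ($0.10/GB-month for gp2, $0.08 for gp3), and the function name and sample inventory are illustrative.

```python
# Assumed us-east-1 list prices in USD per GB-month; check your own region.
PRICE_PER_GB_MONTH = {"gp2": 0.10, "gp3": 0.08}


def ebs_quick_wins(volumes: list) -> dict:
    """Estimate monthly savings from gp2 -> gp3 and from deleting orphans.

    Each volume is a dict with the DescribeVolumes-style keys
    "VolumeId", "Size" (GiB), "State" and "VolumeType".
    """
    gp2_to_gp3_saving = 0.0
    unattached_cost = 0.0
    for vol in volumes:
        price = PRICE_PER_GB_MONTH.get(vol["VolumeType"])
        if price is None:
            continue  # io1, io2, st1, etc. are out of scope for this sketch
        if vol["State"] == "available":
            # Unattached volume: the whole monthly cost is recoverable.
            unattached_cost += vol["Size"] * price
        elif vol["VolumeType"] == "gp2":
            # Attached gp2 volume: only the gp2/gp3 price gap is recoverable.
            gp2_to_gp3_saving += vol["Size"] * (price - PRICE_PER_GB_MONTH["gp3"])
    return {
        "gp2_to_gp3_saving": round(gp2_to_gp3_saving, 2),
        "unattached_cost": round(unattached_cost, 2),
    }


if __name__ == "__main__":
    inventory = [  # hypothetical sample data
        {"VolumeId": "vol-a", "Size": 500, "State": "in-use", "VolumeType": "gp2"},
        {"VolumeId": "vol-b", "Size": 200, "State": "available", "VolumeType": "gp2"},
        {"VolumeId": "vol-c", "Size": 100, "State": "in-use", "VolumeType": "gp3"},
    ]
    print(ebs_quick_wins(inventory))
```

The estimate ignores any extra IOPS or throughput you might provision on gp3, so treat it as an upper bound on the migration saving.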

Run the gp2 → gp3 calculator →

Read “gp2 vs gp3 cost savings” →


4. RDS — right-sized once a year, never again

Why it grows: RDS instances were sized for a load test that ran in 2023. Multi-AZ doubles the cost. Provisioned IOPS were over-set "just in case". Backups + transaction logs accumulate.

Symptoms:

  • One dev/staging cluster per environment, all on db.m5.large or larger.
  • Multi-AZ on non-production clusters.
  • A db.m5.4xlarge for an app that hits the database 5 times per minute.

Quick wins:

  1. Snapshot + delete non-prod replicas you don't actively use. Recreate from snapshot in minutes when you do.
  2. Drop Multi-AZ on staging. Multi-AZ doubles the cost; staging downtime isn't a paged incident.
  3. Right-size production one tier down. RDS lets you scale up in minutes if you misjudged; engineers consistently overestimate the size they need.

5. The long tail — CloudWatch, S3, Lambda, "Other"

Why it grows: every team adds another monitoring agent ("just turn on Container Insights"), every bucket gets versioning enabled "for safety", every Lambda emits a verbose log line for every invocation.

Symptoms:

  • CloudWatch Logs is your 4th-largest line item.
  • S3 storage class is 100% Standard for everything, including backups from 2024.
  • Lambda costs are 10× what your last bill said.

Quick wins:

  1. Set CloudWatch Logs retention. Default is "Never expire". 30 days is fine for app logs; 1 year for audit logs.
  2. Move old S3 objects to Glacier / Deep Archive with a lifecycle policy.
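  3. Suspend S3 bucket versioning on buckets that don't need it, and expire noncurrent versions with a lifecycle rule. Versioning can't be switched off once enabled, only suspended, and existing old versions keep billing until they are deleted. Every overwrite stores another full copy, so storage cost grows with the number of versions you keep.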
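
The S3 side of these quick wins usually comes down to one lifecycle configuration per bucket. The sketch below builds that configuration as a plain dict in the shape accepted by the S3 PutBucketLifecycleConfiguration API. The 90-day and 30-day thresholds, the rule ID, and the function name are illustrative assumptions, not recommendations for your data.

```python
def build_lifecycle_configuration(
    archive_after_days: int = 90,
    expire_noncurrent_after_days: int = 30,
    storage_class: str = "GLACIER",
) -> dict:
    """Build a lifecycle configuration that archives old objects and
    expires noncurrent (old) versions. Thresholds are placeholders."""
    if archive_after_days <= 0 or expire_noncurrent_after_days <= 0:
        raise ValueError("day thresholds must be positive")
    if storage_class not in ("GLACIER", "DEEP_ARCHIVE"):
        raise ValueError("this sketch only handles GLACIER and DEEP_ARCHIVE")
    return {
        "Rules": [
            {
                "ID": "archive-and-trim-versions",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: whole bucket
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": storage_class}
                ],
                "NoncurrentVersionExpiration": {
                    "NoncurrentDays": expire_noncurrent_after_days
                },
            }
        ]
    }


if __name__ == "__main__":
    config = build_lifecycle_configuration()
    print(config)
    # With boto3 installed and credentials configured, this dict could be
    # applied roughly like so (bucket name is a placeholder):
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="my-bucket", LifecycleConfiguration=config)
```

Building the configuration separately from the API call makes it easy to review in a pull request before anything touches a real bucket.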

Where to go next

  • Run the Cloud Waste Radar with eight rough inputs about your footprint. It tells you which of the five line items above is leaking the most money for your specific shape.
  • Read the deeper guides for EKS, NAT Gateway, or gp2 → gp3 when one of them jumps out.
  • Use the startup checklist when you want a Friday-afternoon punch-list ranked by dollar size.

Want the actual numbers, not estimates?

Book a free 30-minute audit and we'll go through your bill line by line — read-only AWS access, signed audit trail, no obligation. We'll come back with the exact $/month next to each lever above for your account.

Book a free audit →