How a simple S3 design decision turned into a $7M cost
The hidden tax of 1.5 trillion objects and why your lifecycle policies might be a financial time bomb.
AWS S3 is widely viewed as inexpensive, effectively unbounded object storage. In this case study, that is exactly how it behaved - with the caveat that our team's storage design decisions cost us close to 40x more than they should have, and a lifecycle policy decision nearly exploded into a $7.2 million bill whilst we were trying to stop the cost bleed.
The Trillion-Object Blind Spot
This article analyses a production system that accumulated 5.6 PB of data across 1.56 trillion objects in a single bucket. Within one year, monthly storage cost increased from approximately $100k to over $400k, with forecasts exceeding $1M per month just 12 months later.
The root cause was not data volume alone, but architectural fragmentation misaligned with S3’s pricing model. The architecture generated hundreds of small snapshot artifacts per request, causing object count to grow faster than volume. A consolidation experiment showed that by aligning object granularity with S3’s economic structure, equivalent logical data could have been stored at 37x lower monthly cost. This case demonstrates that cost modelling must be treated as a first-class architectural constraint.
Paying for an Index, Getting a Bucket
It’s easy in our day-to-day to treat storage costs as purely volumetric - e.g., C ≈ p × V, where C = monthly cost, V = stored volume (GB), and p = price per GB-month.
For AWS S3, an accurate representation requires considering many more facets. Most of them you can find on the official AWS Pricing calculator - hence I will skip the formula definitions here.
There is a crucial cost factor that most people skip, though - one that the official AWS cost calculator also omits: object cardinality. At small scale, object cardinality is negligible relative to volume, but at larger scales it becomes a first-order variable.
The main variable that determines whether object count matters is average object size:

s = V / N

Where s = average object size (GB), V = total stored volume (GB) and N = total object count.
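Plugging the figures quoted in this article into that formula makes the problem obvious - a quick Python check, using decimal (base-1000) units:

```python
# Average object size: s = V / N
V_bytes = 5.6 * 1000**5   # 5.6 PB stored, in bytes (decimal units)
N = 1.56e12               # 1.56 trillion objects
s = V_bytes / N
print(f"average object size ≈ {s / 1000:.1f} KB")  # ≈ 3.6 KB
```

Anything in the single-digit-kilobyte range puts a bucket firmly in "distributed index" territory.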
When average object size falls into the kilobyte range, per-object pricing dominates. The system no longer behaves like bulk storage; it behaves like a massively distributed index - except you are paying storage-layer economics for index-layer behaviour.
A Real-World Example
In the system analysed, our bucket composition was the following:
5.6 petabytes stored
1.56 trillion objects
~3.5 KB average object size
$400k/month storage cost
~$50k/month in request charges
The architecture generated hundreds to thousands of small snapshot artifacts per backend request. Over time, fragmentation compounded and object count grew faster than volume.
A consolidation experiment showed that the same data could instead have been stored in artifacts averaging ~2.5 MB.
This would have reduced storage cost from $457,000 to $11,900 per month for the same volume of data. This represents a 37× structural reduction.
This reduction was not due to compression or deletion of data. The total logical volume remained constant. Only object granularity changed.
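In practice, consolidation means batching: instead of one object per tiny snapshot artifact, buffer artifacts in memory and flush a single larger object once a size target is reached. A minimal sketch of the idea - the buffer class, the ~2.5 MB flush threshold, and the artifact sizes here are illustrative assumptions, not the system's actual code:

```python
# Illustrative consolidation sketch: batch many ~3.5 KB artifacts
# into one ~2.5 MB object before writing to S3.
FLUSH_BYTES = int(2.5 * 1000**2)  # target consolidated object size

class SnapshotBuffer:
    def __init__(self, flush_bytes: int = FLUSH_BYTES):
        self.flush_bytes = flush_bytes
        self.parts, self.size = [], 0
        self.flushed = []              # stands in for the actual S3 writes

    def add(self, artifact: bytes) -> None:
        self.parts.append(artifact)
        self.size += len(artifact)
        if self.size >= self.flush_bytes:
            self.flush()

    def flush(self) -> None:
        if self.parts:
            self.flushed.append(b"".join(self.parts))  # one consolidated object
            self.parts, self.size = [], 0

buf = SnapshotBuffer()
for _ in range(1000):
    buf.add(b"x" * 3500)               # ~3.5 KB artifacts, as in this system
buf.flush()
print(len(buf.flushed))                # prints 2: two objects instead of 1000
```

The stored bytes are identical; only object granularity changes - which is exactly the lever behind the 37x figure.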
The $7.2 Million Lifecycle "Ghost"
Another (nearly) expensive lesson came from trying to fix our exponentially increasing storage costs. Short of permanently deleting data that still held important insights into our service, the only cost-saving option we believed we had was lifecycle transition policies.
At that point, the bucket that now holds over 1.56 trillion objects contained 720 billion objects. The plan: all objects older than 6 months would transition automatically via Lifecycle Policies from S3 Standard to S3 Standard-Infrequent Access, then eventually into Glacier.
This solution could have carried a seven-figure cost, as lifecycle transitions are priced per 1,000 objects. Putting pen to paper, at $0.01 per 1,000 Standard → Standard-IA transition requests:

720,000,000,000 objects ÷ 1,000 × $0.01 = $7,200,000
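The same arithmetic in code - a sketch using the public us-east-1 rate for Standard → Standard-IA lifecycle transition requests (verify current pricing for your region):

```python
OBJECTS = 720_000_000_000           # objects older than 6 months at the time
PRICE_PER_1000_TRANSITIONS = 0.01   # USD per 1,000 Standard -> Standard-IA requests
cost = OBJECTS / 1_000 * PRICE_PER_1000_TRANSITIONS
print(f"${cost:,.0f}")              # $7,200,000
```

And the eventual Glacier hop would be billed again, per 1,000 objects, at its own rate - on top of this.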
Approximately $7.2 million - for a single lifecycle rule on a single bucket. And that excludes the ongoing storage cost in the new tiers on top.

What saved us
The transition never executed, because most of our objects were smaller than 128 KB, and S3 lifecycle rules do not transition such objects to the Infrequent Access tiers by default. The irony: the same fragmentation pattern that caused our excessive steady-state cost also prevented an even larger transition bill.
Why Your Remediation Plan Might Bankrupt You
To prevent similar failures, object storage systems should be evaluated across three explicit budgets.
1. Volume Budget
Projected monthly storage cost.
2. Cardinality Budget
Total object count and average object size.
If average object size falls below a defined threshold (e.g. 1–10 MB for snapshot systems), object count becomes a risk indicator.
3. Remediation Budget
Cost of rewriting, transitioning, or migrating all objects.
Before implementing lifecycle rules or structural migrations, compute the one-off cost of touching every object. For lifecycle transitions, that is roughly:

remediation cost ≈ (N ÷ 1,000) × transition price per 1,000 objects

If remediation cost exceeds acceptable monthly spend, the architecture is already broken.
Object count must be monitored alongside stored bytes. Divergence between the two is architectural drift, not growth.
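A drift check like this can run wherever bucket metrics are already collected - for instance from S3's daily CloudWatch storage metrics (BucketSizeBytes and NumberOfObjects). The 1 MB floor below is an illustrative assumption; set it to your own cardinality budget:

```python
# Cardinality-budget check: alert when average object size drops below a floor.
MIN_AVG_OBJECT_BYTES = 1 * 1000**2   # illustrative 1 MB floor for a snapshot system

def cardinality_alert(bucket_size_bytes: float, object_count: float) -> bool:
    """True when average object size falls below the budgeted floor."""
    if object_count == 0:
        return False
    return bucket_size_bytes / object_count < MIN_AVG_OBJECT_BYTES

# This system: 5.6 PB across 1.56 trillion objects -> ~3.6 KB average
print(cardinality_alert(5.6 * 1000**5, 1.56e12))  # True: the alert fires
```

Tracking this ratio over time is what distinguishes genuine growth (bytes and objects rising together) from architectural drift (objects outpacing bytes).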
Conclusion
Our system scaled flawlessly. It simply became unaffordable under:
Multi-petabyte scale
Trillion-object cardinality
Monthly cost growth from $100k to $400k
Forecast exceeding $1M/month
A potential $7M lifecycle event
Object storage is not purely volumetric. It is priced across bytes, objects, and operations.
At extreme scale, pricing semantics become architectural constraints.
Cost modelling must be treated as a first-class design discipline.
If you want a more in-depth operational view on how to identify and tackle these AWS S3 storage issues, take a look at this article:
Disclaimer: Based on public AWS pricing and production experience. Not an official AWS statement.



