How a simple S3 design decision turned into a $7M cost
The hidden tax of 1.5 trillion objects and why your lifecycle policies might be a financial time bomb.
AWS S3 is widely viewed as inexpensive, effectively unbounded object storage. In this case study, that is exactly how it behaved - with the caveat that our team's storage design decisions cost us close to 40x more than they should have, and a lifecycle policy decision nearly exploded into a $7.2 million bill whilst we were trying to stop the cost bleed.
The Trillion-Object Blind Spot
This article analyses a production system that accumulated 5.6 PB of data across 1.56 trillion objects in a single bucket. Within one year, monthly storage cost increased from approximately $100k to over $400k, with forecasts exceeding $1M per month just 12 months later.
The root cause was not data volume alone, but architectural fragmentation misaligned with S3’s pricing model. The architecture generated hundreds of small snapshot artifacts per request, causing object count to grow faster than volume. A consolidation experiment showed that by aligning object granularity with S3’s economic structure, equivalent logical data could have been stored at 37x lower monthly cost. This case demonstrates that cost modelling must be treated as a first-class architectural constraint.
Paying for an Index, Getting a Bucket
It’s easy in our day-to-day to treat storage costs as purely volumetric - e.g., C ≈ p × V, where C = monthly cost, V = stored volume (GB), and p = price per GB-month.
For AWS S3, an accurate representation requires considering many more facets. Most of them you can find on the official AWS Pricing calculator - hence I will skip the formula definitions here.
There is a crucial cost factor that most people skip, though - one that the official AWS cost calculator also omits: object cardinality. At small scale, object cardinality is negligible relative to volume, but at larger scales it becomes a first-order variable.
The main variable that determines whether object count matters is average object size:

s = V / N

Where s = average object size (GB), V = total stored volume (GB) and N = total object count.
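Plugging the figures quoted in this article into that formula makes the problem obvious - a quick Python check, using decimal (base-1000) units:

```python
# Average object size: s = V / N
V_bytes = 5.6 * 1000**5   # 5.6 PB stored, in bytes (decimal units)
N = 1.56e12               # 1.56 trillion objects
s = V_bytes / N
print(f"average object size ≈ {s / 1000:.1f} KB")  # ≈ 3.6 KB
```

Anything in the single-digit-kilobyte range puts a bucket firmly in "distributed index" territory.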
When average object size falls into the kilobyte range, per-object pricing dominates. The system no longer behaves like bulk storage; it behaves like a massively distributed index - except you are paying storage-layer economics for index-layer behaviour.
A Real-World Example
In the system analysed, our bucket composition was the following:
5.6 petabytes stored
1.56 trillion objects
~3.5 KB average object size
$400k/month storage cost
~$50k/month in request charges
The architecture generated hundreds to thousands of small snapshot artifacts per backend request. Over time, fragmentation compounded and object count grew faster than volume.
A consolidation experiment showed that the same data could instead have been stored in artifacts averaging ~2.5 MB.
This would have reduced storage cost from $457,000 to $11,900 per month for the same volume of data. This represents a 37× structural reduction.
This reduction was not due to compression or deletion of data. The total logical volume remained constant. Only object granularity changed.
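In practice, consolidation means batching: instead of one object per tiny snapshot artifact, buffer artifacts in memory and flush a single larger object once a size target is reached. A minimal sketch of the idea - the buffer class, the ~2.5 MB flush threshold, and the artifact sizes here are illustrative assumptions, not the system's actual code:

```python
# Illustrative consolidation sketch: batch many ~3.5 KB artifacts
# into one ~2.5 MB object before writing to S3.
FLUSH_BYTES = int(2.5 * 1000**2)  # target consolidated object size

class SnapshotBuffer:
    def __init__(self, flush_bytes: int = FLUSH_BYTES):
        self.flush_bytes = flush_bytes
        self.parts, self.size = [], 0
        self.flushed = []              # stands in for the actual S3 writes

    def add(self, artifact: bytes) -> None:
        self.parts.append(artifact)
        self.size += len(artifact)
        if self.size >= self.flush_bytes:
            self.flush()

    def flush(self) -> None:
        if self.parts:
            self.flushed.append(b"".join(self.parts))  # one consolidated object
            self.parts, self.size = [], 0

buf = SnapshotBuffer()
for _ in range(1000):
    buf.add(b"x" * 3500)               # ~3.5 KB artifacts, as in this system
buf.flush()
print(len(buf.flushed))                # prints 2: two objects instead of 1000
```

The stored bytes are identical; only object granularity changes - which is exactly the lever behind the 37x figure.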
The $7.2 Million Lifecycle "Ghost"
Another (nearly) expensive lesson came from trying to fix our exponentially increasing storage costs. Short of permanently deleting data that still held important insights into our service, the only cost-saving option we believed we had was lifecycle transition policies.
At that point, the bucket that now holds over 1.56 trillion objects contained 720 billion objects. The plan: all objects older than 6 months would transition automatically via Lifecycle Policies from S3 Standard to S3 Standard-Infrequent Access, then eventually into Glacier.
This solution could have carried a seven-figure cost, as lifecycle transitions are priced per 1,000 objects. Putting pen to paper, at $0.01 per 1,000 Standard → Standard-IA transition requests:

720,000,000,000 objects ÷ 1,000 × $0.01 = $7,200,000
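The same arithmetic in code - a sketch using the public us-east-1 rate for Standard → Standard-IA lifecycle transition requests (verify current pricing for your region):

```python
OBJECTS = 720_000_000_000           # objects older than 6 months at the time
PRICE_PER_1000_TRANSITIONS = 0.01   # USD per 1,000 Standard -> Standard-IA requests
cost = OBJECTS / 1_000 * PRICE_PER_1000_TRANSITIONS
print(f"${cost:,.0f}")              # $7,200,000
```

And the eventual Glacier hop would be billed again, per 1,000 objects, at its own rate - on top of this.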
Approximately $7.2 million - for a single lifecycle rule on a single bucket. And that excludes the ongoing storage cost in the new tiers on top.

What saved us
The transition never executed, because most of our objects were smaller than 128 KB, and S3 lifecycle rules do not transition such objects to the Infrequent Access tiers by default. The irony: the same fragmentation pattern that caused our excessive steady-state cost also prevented an even larger transition bill.
Why Your Remediation Plan Might Bankrupt You
To prevent similar failures, object storage systems should be evaluated across three explicit budgets.
1. Volume Budget
Projected monthly storage cost.
2. Cardinality Budget
Total object count and average object size.
If average object size falls below a defined threshold (e.g. 1–10 MB for snapshot systems), object count becomes a risk indicator.
3. Remediation Budget
Cost of rewriting, transitioning, or migrating all objects.
Before implementing lifecycle rules or structural migrations, compute the one-off cost of touching every object. For lifecycle transitions, that is roughly:

remediation cost ≈ (N ÷ 1,000) × transition price per 1,000 objects

If remediation cost exceeds acceptable monthly spend, the architecture is already broken.
Object count must be monitored alongside stored bytes. Divergence between the two is architectural drift, not growth.
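A drift check like this can run wherever bucket metrics are already collected - for instance from S3's daily CloudWatch storage metrics (BucketSizeBytes and NumberOfObjects). The 1 MB floor below is an illustrative assumption; set it to your own cardinality budget:

```python
# Cardinality-budget check: alert when average object size drops below a floor.
MIN_AVG_OBJECT_BYTES = 1 * 1000**2   # illustrative 1 MB floor for a snapshot system

def cardinality_alert(bucket_size_bytes: float, object_count: float) -> bool:
    """True when average object size falls below the budgeted floor."""
    if object_count == 0:
        return False
    return bucket_size_bytes / object_count < MIN_AVG_OBJECT_BYTES

# This system: 5.6 PB across 1.56 trillion objects -> ~3.6 KB average
print(cardinality_alert(5.6 * 1000**5, 1.56e12))  # True: the alert fires
```

Tracking this ratio over time is what distinguishes genuine growth (bytes and objects rising together) from architectural drift (objects outpacing bytes).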
Conclusion
Our system scaled flawlessly. It simply became unaffordable under:
Multi-petabyte scale
Trillion-object cardinality
Monthly cost growth from $100k to $400k
Forecast exceeding $1M/month
A potential $7M lifecycle event
Object storage is not purely volumetric. It is priced across bytes, objects, and operations.
At extreme scale, pricing semantics become architectural constraints.
Cost modelling must be treated as a first-class design discipline.
If you want a more in-depth operational view on how to identify and tackle these AWS S3 storage issues, take a look at this article:
Disclaimer: Based on public AWS pricing and production experience. Not an official AWS statement.



