Over-Provisioned by Default
It's one of the most common patterns in AWS: an ad platform spins up infrastructure to handle a big campaign, the campaign ends, and the infrastructure stays. Multiply that by two years of growth and you get what we found at a national advertising agency - a compute fleet where the average instance was using less than a third of its provisioned resources.
The team wasn't being careless. They were being cautious. In adtech, latency kills revenue. A slow ad-serve means lost impressions and unhappy clients. So when in doubt, the instinct was always to go bigger - larger instance types, more memory, faster storage. The problem is that "bigger" has a monthly cost, and over time those costs compound.
When we audited their environment, the oversizing was consistent across the board. But so was the opportunity.
The Rightsizing Process
Rightsizing sounds simple - use smaller instances. In practice, it requires careful analysis to avoid performance problems. Here's how we approached it:
Step 1: Baseline Everything
We pulled CloudWatch metrics for every EC2 instance over a 30-day window. For each instance, we tracked:
- Average and peak CPU utilization
- Memory usage (via CloudWatch agent - not available by default)
- Network throughput (in/out)
- Disk I/O (IOPS and throughput)
This gave us a realistic picture of what each instance actually needed versus what it was provisioned for. The CloudWatch agent piece is critical - without memory metrics, you're guessing.
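The baseline step reduces to simple arithmetic once the datapoints are in hand. A minimal sketch, assuming the 30-day metric series have already been fetched (for example via the CloudWatch API); the sample values are illustrative, not real data:

```python
# Sketch: summarize 30 days of utilization datapoints for one instance.
# Assumes the metric series was already pulled (e.g. via CloudWatch);
# the sample values below are hypothetical.

def summarize(datapoints):
    """Return average and peak from a list of utilization percentages."""
    return {
        "avg": sum(datapoints) / len(datapoints),
        "peak": max(datapoints),
    }

cpu_samples = [12.0, 18.5, 22.0, 15.5, 45.0, 17.0]  # hypothetical CPU %
baseline = summarize(cpu_samples)
print(f"avg={baseline['avg']:.1f}% peak={baseline['peak']:.1f}%")
```

Tracking the peak alongside the average matters: the peak is what determines whether a smaller instance can survive a traffic spike.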
Step 2: Categorize and Prioritize
We grouped instances by workload type and sorted by waste:
| Workload | Instance Type | Avg CPU | Avg Memory | Recommendation |
|----------|---------------|---------|------------|----------------|
| Ad-serving API | r5.2xlarge | 18% | 22% | r5.large |
| Campaign analytics | m5.4xlarge | 25% | 31% | m5.xlarge |
| Asset processing | c5.4xlarge | 35% | 15% | c5.xlarge |
| Internal tools | t3.xlarge | 8% | 12% | t3.medium |
The ad-serving API was the biggest opportunity: it was running on r5.2xlarge instances (8 vCPUs, 64 GB RAM) when the actual workload needed an r5.large (2 vCPUs, 16 GB RAM). That's 4x overprovisioning across a fleet of dozens of instances.
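The sorting rule behind that table can be sketched in a few lines. The 40% threshold here is our judgment call, not an AWS-prescribed value:

```python
# Sketch of the categorization rule: flag an instance as a rightsizing
# candidate when both average CPU and average memory sit well below
# capacity. The 40% threshold is a judgment call, not an AWS default.

THRESHOLD = 40.0  # percent utilization below which we flag waste

def rightsizing_candidate(avg_cpu, avg_mem, threshold=THRESHOLD):
    return avg_cpu < threshold and avg_mem < threshold

fleet = {
    "ad-serving-api":     (18.0, 22.0),  # r5.2xlarge
    "campaign-analytics": (25.0, 31.0),  # m5.4xlarge
    "asset-processing":   (35.0, 15.0),  # c5.4xlarge
    "internal-tools":     (8.0, 12.0),   # t3.xlarge
}

candidates = [name for name, (cpu, mem) in fleet.items()
              if rightsizing_candidate(cpu, mem)]
print(candidates)  # all four workloads qualify at a 40% threshold
```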
Step 3: Test Before Switching
We didn't just swap instance types and hope for the best. For each workload category, we:
- Launched the recommended instance type alongside the existing one
- Shifted a portion of traffic to the new instance
- Monitored CPU, memory, latency, and error rates for 48-72 hours
- Confirmed performance parity before proceeding
This validation step is where most DIY rightsizing efforts fall short. The data might say an instance is underutilized, but you need to verify that the new size can handle traffic spikes and peak processing loads - not just the average.
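One way to make "performance parity" concrete is to compare tail latency, not averages. A minimal sketch of that check, with illustrative thresholds and sample values:

```python
# Sketch of the parity check from the canary phase: compare latency
# samples from the old vs new instance size at the tail (p99), not the
# average. The 10% tolerance and sample values are illustrative.

def percentile(samples, p):
    s = sorted(samples)
    idx = min(len(s) - 1, round(p / 100 * (len(s) - 1)))
    return s[idx]

def parity_ok(old_latencies, new_latencies, tolerance=1.10):
    """New size passes if its p99 latency is within 10% of the old one."""
    return percentile(new_latencies, 99) <= percentile(old_latencies, 99) * tolerance

old = [20, 22, 21, 25, 90]   # ms, hypothetical samples from r5.2xlarge
new = [21, 23, 22, 26, 93]   # ms, hypothetical samples from r5.large
print(parity_ok(old, new))
```

The same comparison applies to error rates; an instance that matches on averages but blows up at p99 fails the canary.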
Step 4: Execute the Migration
Once validated, we migrated workloads in rolling batches. Zero downtime, zero performance impact. The compute savings alone were significant - but we weren't done.
Storage Optimization: The Hidden Cost Center
Storage costs tend to be invisible. They're not as dramatic as compute costs, and they accumulate gradually. But in an adtech environment generating terabytes of campaign data, creative assets, and analytics output, storage adds up fast.
Here's what we found and fixed:
EBS Volume Rightsizing
- gp2 volumes sized for burst performance they never used - Many volumes were provisioned at 500GB+ specifically because gp2 ties IOPS to volume size. We migrated these to gp3, which provides 3,000 baseline IOPS regardless of size, then replaced oversized volumes with smaller ones matched to actual usage (EBS volumes can't be shrunk in place, so this meant migrating data to new, smaller volumes)
- Provisioned IOPS (io1) volumes on non-critical workloads - Some internal analytics databases were running on io1 volumes provisioned for 10,000 IOPS when actual usage averaged 800 IOPS. Migrated to gp3 with a cost reduction of over 70% per volume
- Orphaned volumes - 23 EBS volumes attached to nothing. Leftover from terminated instances that someone forgot to clean up. Easy savings - just delete them after confirming no data recovery need
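The io1-to-gp3 math is worth seeing. A back-of-the-envelope sketch; the per-GB and per-IOPS rates are approximate us-east-1 list prices at the time of writing and will drift, so treat them as assumptions, not a quote:

```python
# Back-of-the-envelope monthly cost comparison for the volume
# migrations. Rates are approximate us-east-1 list prices (assumptions,
# not a quote - check current AWS pricing).

GP3_GB = 0.08          # $/GB-month, includes 3,000 baseline IOPS
GP3_EXTRA_IOPS = 0.005 # $/provisioned IOPS-month above 3,000
IO1_GB = 0.125         # $/GB-month
IO1_IOPS = 0.065       # $/provisioned IOPS-month

def gp3_cost(size_gb, iops=3000):
    return size_gb * GP3_GB + max(0, iops - 3000) * GP3_EXTRA_IOPS

def io1_cost(size_gb, iops):
    return size_gb * IO1_GB + iops * IO1_IOPS

# A hypothetical 500 GB io1 volume provisioned at 10,000 IOPS,
# replaced by a 200 GB gp3 volume at baseline IOPS:
before = io1_cost(500, 10_000)   # 62.50 storage + 650.00 IOPS
after = gp3_cost(200)
print(f"${before:.2f} -> ${after:.2f}/month")
```

Notice that on io1 the provisioned IOPS dominate the bill, which is why the savings on those volumes exceeded 70% even before shrinking them.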
S3 Lifecycle Policies
The agency's S3 buckets were a time capsule. Campaign assets, raw analytics data, and processed reports going back years - all sitting in S3 Standard storage class at full price.
We implemented tiered lifecycle policies:
- Campaign assets older than 90 days → S3 Infrequent Access (40% cheaper)
- Raw analytics data older than 30 days → S3 Infrequent Access
- Raw analytics data older than 180 days → S3 Glacier Instant Retrieval (68% cheaper)
- Processed reports older than 1 year → S3 Glacier Deep Archive (95% cheaper)
- Incomplete multipart uploads → auto-abort after 7 days
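The tiers above map directly to an S3 lifecycle configuration. A sketch in the shape boto3's `put_bucket_lifecycle_configuration` expects; the prefixes are hypothetical and would need to match your actual bucket layout:

```python
# Sketch of the lifecycle tiers above, in the shape that boto3's
# put_bucket_lifecycle_configuration expects. Prefixes are hypothetical.

lifecycle = {
    "Rules": [
        {
            "ID": "campaign-assets-to-ia",
            "Status": "Enabled",
            "Filter": {"Prefix": "campaign-assets/"},
            "Transitions": [{"Days": 90, "StorageClass": "STANDARD_IA"}],
        },
        {
            "ID": "raw-analytics-tiering",
            "Status": "Enabled",
            "Filter": {"Prefix": "analytics/raw/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 180, "StorageClass": "GLACIER_IR"},
            ],
        },
        {
            "ID": "reports-to-deep-archive",
            "Status": "Enabled",
            "Filter": {"Prefix": "reports/"},
            "Transitions": [{"Days": 365, "StorageClass": "DEEP_ARCHIVE"}],
        },
        {
            "ID": "abort-stale-multipart-uploads",
            "Status": "Enabled",
            "Filter": {},
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        },
    ]
}

# Apply with (assumes credentials are configured):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="your-bucket", LifecycleConfiguration=lifecycle)
```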
The lifecycle policies alone reduced S3 costs meaningfully without deleting a single file. The data is still accessible - it just costs less to store.
Snapshot Cleanup
EBS snapshots are another quiet cost accumulator. The team had daily snapshots going back 14 months with no retention policy. We implemented automated snapshot management:
- Keep daily snapshots for 7 days
- Keep weekly snapshots for 4 weeks
- Keep monthly snapshots for 12 months
- Delete everything else
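The retention schedule above is classic grandfather-father-son logic. A sketch of the decision rule; in practice you'd use Amazon Data Lifecycle Manager rather than hand-rolled code, and the "Sunday" and "first of month" anchors here are illustrative choices:

```python
# Sketch of the grandfather-father-son retention rule behind the
# snapshot policy. Real deployments typically use Amazon Data Lifecycle
# Manager; the weekly/monthly anchor days below are illustrative.

from datetime import date

def keep_snapshot(snap_date, today):
    age = (today - snap_date).days
    if age <= 7:
        return True                                # daily tier
    if age <= 28 and snap_date.weekday() == 6:     # weekly tier (Sundays)
        return True
    if age <= 365 and snap_date.day == 1:          # monthly tier
        return True
    return False

today = date(2024, 6, 15)
print(keep_snapshot(date(2024, 6, 12), today))  # 3 days old -> kept
print(keep_snapshot(date(2024, 1, 1), today))   # first-of-month -> kept
print(keep_snapshot(date(2024, 3, 14), today))  # mid-month, 3 months old -> deleted
```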
This reduced snapshot storage costs substantially while maintaining a reasonable backup history.
Combined Results
The full results are detailed in our advertising firm case study, but the rightsizing and storage optimization specifically delivered:
- Instance costs reduced by approximately 40% through rightsizing alone - no application changes required
- EBS costs reduced by over 50% through gp3 migration, volume rightsizing, and orphan cleanup
- S3 costs reduced by approximately 35% through lifecycle policies and storage class optimization
- Snapshot costs cut significantly through automated retention policies
- Zero performance degradation - validated through staged testing before every change
The total AWS cost reduction across all optimization categories came to 30% annually. Rightsizing and storage optimization were the largest contributors, but they were part of a broader effort that included auto scaling optimization and commitment strategy (Savings Plans for the newly right-sized baseline).
Why AdTech Companies Are Especially Prone to This
Advertising technology companies share a few characteristics that make them particularly susceptible to over-provisioning and storage sprawl:
- Revenue-sensitive latency - When a slow response means lost ad impressions, teams default to bigger infrastructure. Rightsizing feels risky when your revenue depends on milliseconds
- Campaign-driven data accumulation - Every campaign generates assets, analytics, and reports. Without lifecycle policies, storage grows indefinitely
- Rapid growth - AdTech companies often scale infrastructure quickly to support new clients. The infrastructure decisions made during rapid growth rarely get revisited
- Peak provisioning - Campaign launches create genuine spikes. But provisioning for the spike and leaving it running is dramatically more expensive than scaling dynamically
Where to Start
If you suspect your AWS environment is over-provisioned, here's a practical starting point:
- Install the CloudWatch agent on every EC2 instance if you haven't already - you need memory metrics, not just CPU
- Pull 30 days of utilization data and look for instances consistently below 40% CPU and memory
- Check your EBS volumes - are you still on gp2? Do you have orphaned volumes? Is anything on io1 that doesn't need it?
- Audit your S3 buckets - sort by size and check the last access dates. If data hasn't been accessed in 90+ days, it belongs in a cheaper storage class
- Review your snapshot retention - if there's no policy, you're probably keeping far more than you need
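The orphaned-volume check from the list above is one of the quickest wins to script. A sketch that filters an EC2 `DescribeVolumes` response for volumes in the `available` state (attached to nothing); the sample response is hypothetical:

```python
# Sketch of the orphaned-volume check: a volume in the "available"
# state is attached to nothing. The sample response is hypothetical.

def orphaned_volumes(describe_volumes_response):
    return [v["VolumeId"] for v in describe_volumes_response["Volumes"]
            if v["State"] == "available"]

sample = {"Volumes": [
    {"VolumeId": "vol-0aaa", "State": "in-use"},
    {"VolumeId": "vol-0bbb", "State": "available"},  # orphan
    {"VolumeId": "vol-0ccc", "State": "available"},  # orphan
]}
print(orphaned_volumes(sample))  # ['vol-0bbb', 'vol-0ccc']

# Live version (assumes credentials are configured):
# import boto3
# resp = boto3.client("ec2").describe_volumes(
#     Filters=[{"Name": "status", "Values": ["available"]}])
```

As in the case study, confirm there's no data-recovery need before deleting anything the filter turns up.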
If you want a thorough analysis of your specific environment, schedule a free consultation. We'll review your utilization data, identify the biggest savings opportunities, and give you a prioritized action plan - whether you implement it yourself or bring us in to help.