The Problem: Paying for Peak Capacity 24/7
When a national advertising agency came to us, their AWS bill was growing faster than their revenue. The root cause was straightforward once we looked at the data: they were provisioned for worst-case load around the clock.
Their ad-serving infrastructure, campaign analytics pipelines, and creative asset processing all ran on fixed-size EC2 fleets. During campaign launches, traffic would spike 4-5x - so the team had sized everything to handle that peak. The other 90% of the time, those instances were sitting at 15-25% CPU utilization.
It's a pattern we see constantly in adtech and media companies. Traffic is inherently spiky - tied to campaign launches, dayparting, and seasonal cycles - but infrastructure stays flat.
What We Found in the Utilization Audit
Before recommending anything, we pulled four weeks of CloudWatch metrics across their entire compute fleet. The numbers told a clear story:
- 72% of instances were running below 30% average CPU utilization
- Campaign analytics clusters peaked for 3-4 hours per day, then sat nearly idle
- Creative asset processing spiked during business hours but dropped to near-zero overnight and on weekends
- Ad-serving instances had legitimate traffic variation but were sized for the absolute worst case with no scaling policy
The waste wasn't coming from one place - it was spread across the entire infrastructure. But the fix didn't require rearchitecting anything. It required matching capacity to demand in real time.
The Auto Scaling Strategy
We designed three different scaling approaches matched to the actual traffic patterns of each workload:
1. Target Tracking for Ad-Serving
Ad-serving traffic is unpredictable - it depends on which campaigns are active, bid volumes, and real-time auction activity. For these workloads, we implemented target tracking auto scaling policies tied to CPU utilization and request count.
The key was setting the right target. Too aggressive and you get scaling thrash. Too conservative and you're still overpaying. We tested with historical traffic data and landed on a 60% CPU target with a 3-minute cooldown - responsive enough to handle a campaign launch spike within minutes, stable enough to avoid constant scaling noise.
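The proportional math behind target tracking is easy to sketch. This is a simplified stand-in for what the policy computes, not AWS's exact implementation, and the fleet sizes and bounds are illustrative:

```python
import math

def desired_capacity(current_instances: int, avg_cpu: float,
                     target_cpu: float = 60.0,
                     min_size: int = 2, max_size: int = 40) -> int:
    """Approximate a target-tracking decision: resize the fleet
    proportionally so average CPU lands near the target, clamped
    to the group's min/max bounds."""
    raw = current_instances * (avg_cpu / target_cpu)
    return max(min_size, min(max_size, math.ceil(raw)))

# A campaign launch pushes a 10-instance fleet to 95% average CPU:
print(desired_capacity(10, 95.0))   # 16 - scale out
# Overnight, the same fleet idles at 18% CPU:
print(desired_capacity(10, 18.0))   # 3 - scale in
```

The cooldown matters because this calculation runs against a noisy metric: without a settling period, a fleet hovering near the target flips between sizes on every evaluation.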
2. Scheduled Scaling for Predictable Patterns
Campaign analytics and reporting followed a clear daily pattern: heavy usage during business hours (8am-8pm EST), minimal overnight. Weekend traffic dropped 70% compared to weekdays.
For these workloads, scheduled scaling was the obvious choice. We set up cron-based policies that scaled the fleet down by 60% overnight and 70% on weekends, then back up before the next business day. Simple, predictable, and immediately effective.
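The schedule logic itself is trivial, which is part of the appeal. A sketch of the capacity function the cron policies encode (the peak size and cut percentages here are illustrative, not the client's actual fleet):

```python
def scheduled_size(peak_size: int, weekday: int, hour_est: int) -> int:
    """Fleet size under a cron-style schedule: full size 8am-8pm EST
    on weekdays, 60% smaller on weeknights, 70% smaller on weekends.
    weekday: 0=Monday .. 6=Sunday."""
    if weekday >= 5:                         # Saturday / Sunday
        return max(1, round(peak_size * 0.30))
    if 8 <= hour_est < 20:                   # business hours
        return peak_size
    return max(1, round(peak_size * 0.40))   # weeknight

print(scheduled_size(20, weekday=2, hour_est=14))  # Wednesday 2pm: 20
print(scheduled_size(20, weekday=2, hour_est=2))   # Wednesday 2am: 8
print(scheduled_size(20, weekday=6, hour_est=14))  # Sunday: 6
```

In practice each branch maps to a scheduled scaling action with a cron recurrence on the auto scaling group, so the platform applies the resize - no code runs in production.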
3. Queue-Based Scaling for Processing
Creative asset processing - video transcoding, image resizing, banner generation - was driven by internal submissions, not external traffic. We replaced the fixed fleet with an auto scaling group tied to SQS queue depth.
When designers uploaded a batch of assets, the queue filled up, instances spun up to process them, and then scaled back to a minimal baseline once the queue drained. Processing time stayed the same. Costs dropped dramatically because we stopped paying for idle capacity between batches.
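The sizing rule for queue-driven work is usually backlog-based: run enough instances that the current queue drains within an acceptable time. A minimal sketch, where the drain target, per-job time, and fleet bounds are assumptions for illustration:

```python
import math

def workers_for_queue(queue_depth: int, avg_seconds_per_job: float,
                      target_drain_s: float = 600.0,
                      min_size: int = 1, max_size: int = 30) -> int:
    """Size the fleet from SQS backlog: enough instances to drain
    the visible queue within the target time, assuming each
    instance works one job at a time."""
    jobs_per_instance = target_drain_s / avg_seconds_per_job
    raw = math.ceil(queue_depth / jobs_per_instance)
    return max(min_size, min(max_size, raw))

# A designer uploads 400 assets averaging 45s each; drain in ~10 min:
print(workers_for_queue(400, 45.0))  # 30
# Queue drained: fall back to the minimal baseline:
print(workers_for_queue(0, 45.0))    # 1
```

The queue-depth metric (SQS `ApproximateNumberOfMessagesVisible`) feeds this as a custom scaling metric, so capacity follows the backlog rather than any clock or CPU signal.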
The Results
You can read the full case study for the complete breakdown, but here are the headline numbers:

- 30% reduction in annual AWS compute spend
- Zero degradation in ad-serving latency or analytics throughput
- Compute costs now track actual demand instead of theoretical peak capacity
- Campaign launch spikes handled automatically - no more manual instance provisioning
- Overnight and weekend waste eliminated through scheduled scaling
The 30% number is the annual average. During off-peak hours, the savings are significantly higher - closer to 60-70% compared to what they were paying before.
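A quick sanity check on how deep off-peak savings blend down to a smaller annual number. The split below is an assumed illustration, not the client's actual spend breakdown:

```python
# Assumed split for illustration: half the old spend fell in busy
# hours (where auto scaling saves little), half in off-peak hours.
peak_share      = 0.5    # share of old spend during busy hours
peak_savings    = 0.0    # little headroom while traffic is high
offpeak_savings = 0.60   # deep cuts overnight and on weekends

blended = peak_share * peak_savings + (1 - peak_share) * offpeak_savings
print(f"{blended:.0%}")  # 30%
```

The point: even when peak hours save nothing, eliminating idle capacity the rest of the time moves the annual bill substantially.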
Why This Matters for AdTech Companies
Advertising and media companies have some of the spikiest traffic patterns in any industry. Campaign launches, real-time bidding, seasonal events, and dayparting all create load that varies dramatically hour to hour.
If you're running a fixed compute fleet to handle those spikes, you're almost certainly overpaying. The gap between peak capacity and average utilization is where the waste lives.
Auto scaling isn't a new concept - but the implementation details matter. The wrong scaling policy can cause latency spikes during scale-up events or leave you exposed during a traffic surge. Getting it right requires understanding both the traffic patterns and the tolerance for scaling lag in each workload.
Getting Started
If your AWS bill feels too high relative to your actual traffic, start here:
- Pull utilization data - look at 4 weeks of CloudWatch metrics for CPU, memory, and network across your compute fleet
- Identify the pattern - is the workload spiky, predictable, or queue-driven? The pattern determines the scaling approach
- Size your baseline - what's the minimum fleet size you need at the quietest point? That's your starting point
- Test with historical data - use past traffic patterns to validate your scaling policy before applying it to production
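Once you've exported the metrics from CloudWatch, the first pass over the data can be very simple. A minimal sketch of the utilization summary behind step one - the threshold and the sample day are illustrative:

```python
from statistics import mean

def audit(hourly_cpu: list[float], low_threshold: float = 30.0) -> dict:
    """Summarize a fleet's hourly CPU series: average utilization
    and the share of hours spent below the threshold."""
    below = sum(1 for v in hourly_cpu if v < low_threshold)
    return {
        "avg_cpu": round(mean(hourly_cpu), 1),
        "share_below_30pct": round(below / len(hourly_cpu), 2),
    }

# One illustrative day: busy 8am-8pm, near-idle otherwise.
day = [12] * 8 + [58] * 12 + [12] * 4
print(audit(day))  # {'avg_cpu': 35.0, 'share_below_30pct': 0.5}
```

A fleet that spends half its hours under 30% CPU, like this one, is exactly the profile where scheduled or target-tracking scaling pays off fastest.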
If you want help analyzing your AWS environment for auto scaling opportunities, schedule a free consultation. We'll look at your actual utilization data and tell you where the savings are.