Welcome to AWS Wednesday — our weekly cloud recap. No product pitches, no certification prep. Just the stuff that matters if you're building on AWS.
This week is a big one. Amazon S3 turned 20, Lambda got a major language upgrade, Route 53 went truly global, and Cerebras brought blazing-fast inference to Bedrock. Let's get into it.
S3 Turns 20: The Service That Created Cloud Computing
On March 14, 2006, Amazon launched Simple Storage Service — S3 — and quietly started a revolution. There was no keynote. No hype cycle. Just a REST API that let you store and retrieve objects over HTTP for $0.15 per gigabyte per month.
Twenty years later, S3 stores over 500 trillion objects and handles more than 200 million requests per second. Those numbers are almost incomprehensible: 500 trillion objects works out to more than 60,000 for every person on Earth, served at a request rate no single system came close to when S3 launched.
But the numbers aren't really the story. The story is what S3 made possible.
Before S3, if you wanted to store files at scale, you bought hardware. You racked servers. You managed RAID arrays and worried about disk failures at 3 AM. Startups couldn't afford the infrastructure that enterprises took for granted, and enterprises couldn't move fast enough to compete with startups.
S3 erased that. A two-person startup in a garage could store petabytes of data with the same reliability as a Fortune 500 company. The API was the equalizer.
And it wasn't just storage. S3 proved the model. It proved that infrastructure could be an API call, that you could pay for what you use, and that "someone else's problem" was actually a valid infrastructure strategy. Every cloud service that came after — EC2, Lambda, DynamoDB, every managed service on every cloud provider — exists because S3 proved the economics worked.
The price trajectory tells its own story. That original $0.15/GB/month is now $0.023/GB/month for standard storage, and as low as $0.00099/GB/month on S3 Glacier Deep Archive. That's an 85% drop for standard and a better-than-99% drop for archival. And the service is incomparably better: versioning, replication, intelligent tiering, event notifications, Object Lambda, server-side encryption with customer-managed keys. The S3 of 2026 barely resembles the S3 of 2006, except in the one way that matters: it still just works.
What the next 20 years look like: Expect storage to become even more invisible. S3 Express One Zone already offers single-digit millisecond latency. Intelligent-Tiering is getting smarter. The trend is toward storage that optimizes itself — you put data in, and the system figures out the cheapest, fastest way to keep it available. We're moving from "manage your storage" to "forget about your storage."
Route 53 Global Resolver Goes GA
Amazon Route 53 Resolver's global resolver reached general availability this month. If you're running multi-region architectures, this one is worth paying attention to.
What it is: Previously, Route 53 Resolver endpoints were regional. You'd set up resolver rules in us-east-1, and those rules applied to us-east-1. If you wanted the same DNS resolution behavior in eu-west-1, you'd duplicate the configuration. At scale, this became a management headache — and more importantly, it introduced latency for cross-region queries.
The global resolver changes this. You configure your DNS resolution rules once, and they apply everywhere. Route 53 automatically routes queries to the nearest resolver endpoint, which means faster resolution times and simpler architecture.
Who should care: Anyone running workloads across three or more regions. Anyone with hybrid cloud setups where on-premises DNS needs to talk to AWS. Anyone who's been maintaining parallel resolver configurations and hating it.
The practical impact: Faster DNS resolution means faster first-byte times for your users. In a multi-region active-active setup, shaving 20-50ms off DNS resolution compounds across every request. For applications where latency is revenue — e-commerce, financial services, real-time APIs — this is meaningful.
It's not a flashy announcement, but it's the kind of infrastructure improvement that makes everything else run a little bit better.
Lambda Now Supports Rust Natively
This is the one that's going to make systems programmers very happy. AWS Lambda now supports Rust as a first-class runtime through managed instances. No more custom runtimes. No more wrapping Rust binaries in provided.al2023 hacks. Native support, native tooling, native cold start optimization.
Why this matters:
Cold starts. Rust binaries are small and start fast. We're seeing cold start times in the sub-10ms range for Rust Lambda functions — compared to 200-500ms for Java and 100-200ms for Python with heavy dependencies. For event-driven architectures where cold starts are the tax you pay for serverless, Rust basically eliminates that tax.
Memory safety without garbage collection. Rust's ownership model gives you memory safety at compile time. No garbage collector means no GC pauses, no unpredictable latency spikes. For Lambda functions that need consistent performance — real-time data processing, API backends with strict SLAs — this is a significant advantage.
Performance per dollar. Lambda bills by duration and memory. Rust functions execute faster and use less memory than equivalent Go, Python, or Node.js functions. In benchmarks, Rust Lambda functions consistently run 2-5x faster than Python equivalents and use 3-10x less memory. When you're running millions of invocations per day, that translates to real money.
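The performance-per-dollar claim is easy to sanity-check, because Lambda's billing model is just arithmetic. Here's a rough cost model; the rates are current us-east-1 x86 pricing, and the workload numbers (invocation count, durations, memory sizes) are made up for illustration:

```python
# Back-of-the-envelope Lambda cost model. Rates are illustrative
# us-east-1 x86 figures; check current pricing before relying on them.
GB_SECOND = 0.0000166667        # $ per GB-second of execution
PER_REQUEST = 0.20 / 1_000_000  # $ per invocation

def monthly_cost(invocations, avg_ms, memory_mb):
    """Estimated monthly Lambda bill for one function."""
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * GB_SECOND + invocations * PER_REQUEST

# Hypothetical workload: 50M invocations/month. Python at 120 ms / 512 MB
# vs. a Rust port of the same function at 40 ms / 128 MB.
python_cost = monthly_cost(50_000_000, 120, 512)
rust_cost = monthly_cost(50_000_000, 40, 128)
print(f"Python: ${python_cost:,.2f}/month  Rust: ${rust_cost:,.2f}/month")
```

With those assumptions the Rust version comes in around a quarter of the Python bill, and most of what's left is the fixed per-request charge, which no runtime change can touch.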
The catch: Rust has a steeper learning curve. If your team is all Python, you're not going to rewrite everything in Rust overnight. But for performance-critical paths — the functions that run millions of times, the ones that process real-time streams, the API handlers where every millisecond matters — Rust is now the obvious choice on Lambda.
Start with your hottest path. Port one function. Measure the difference. Then decide how far you want to go.
Cerebras Lands on Bedrock
AWS announced a partnership with Cerebras Systems to bring their inference capabilities to Amazon Bedrock. If you haven't been following Cerebras, here's the short version: they build wafer-scale chips — literally entire silicon wafers as single processors — and they're using a disaggregated inference approach that's producing some eye-popping numbers.
What disaggregated inference means: Traditional LLM inference runs the entire model on one set of GPUs. Cerebras separates the compute and memory tiers, streaming model weights from a memory tier to a compute tier on demand. This sounds like it would be slower, but because of their wafer-scale architecture and massive memory bandwidth, it's actually dramatically faster for certain workloads.
The Bedrock integration means you can now access Cerebras-powered inference through the same Bedrock API you're already using. No new SDK. No new deployment model. You select Cerebras as a provider, point your existing code at it, and get faster inference.
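"Same API" really does mean same request shape. Here's what a Converse-style request looks like; the model ID below is a placeholder I made up, so check the Bedrock console for the actual Cerebras identifiers:

```python
import json

# Bedrock's Converse API uses the same request shape regardless of
# provider. The model ID here is hypothetical -- look up the real
# Cerebras identifiers in the Bedrock model catalog.
MODEL_ID = "cerebras.example-model-v1"  # placeholder, not a real ID

request = {
    "modelId": MODEL_ID,
    "messages": [
        {"role": "user",
         "content": [{"text": "Summarize this week's AWS news."}]}
    ],
    "inferenceConfig": {"maxTokens": 512, "temperature": 0.2},
}

# With boto3, the call would be:
#   client = boto3.client("bedrock-runtime")
#   response = client.converse(**request)
print(json.dumps(request, indent=2))
```

Swapping providers is literally a change to `modelId`; the rest of your integration code stays put.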
Why this matters for production LLM workloads: Latency and throughput are the two constraints that determine whether an LLM application feels magical or feels broken. If your chatbot takes 4 seconds to start responding, users leave. If your RAG pipeline takes 30 seconds to synthesize an answer, no one will use it. Cerebras on Bedrock gives you another lever to pull when optimizing for speed.
The cost question is still shaking out — Cerebras inference isn't cheap, but for latency-sensitive workloads where speed directly impacts user experience or revenue, the math can work out. Especially if faster inference means you can use fewer provisioned resources.
The Part Where We Talk About Your Bill
Every AWS announcement is also a billing announcement. Let's talk about the money side of this week's news.
S3: The Lifecycle Policies You're Probably Missing
After 20 years, S3's pricing is well-understood. But I still see teams leaving money on the table:
Incomplete multipart uploads. When a multipart upload fails or is abandoned, the parts stay in your bucket. Forever. Silently accumulating charges. Add a lifecycle rule to abort incomplete multipart uploads after 7 days. This one change saves some teams thousands per month.
Intelligent-Tiering monitoring fees. S3 Intelligent-Tiering charges $0.0025 per 1,000 objects per month for monitoring. If you have billions of small objects, this fee can exceed the savings from tiering. Do the math before enabling it on everything.
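Doing that math is a one-liner. Using the monitoring fee above and assuming an object drops from Standard to the Infrequent Access tier (illustrative prices in the code), you can find the object size below which monitoring can never pay for itself:

```python
# When does Intelligent-Tiering's monitoring fee exceed its savings?
# Assumed illustrative prices: $0.0025 per 1,000 objects/month monitoring,
# Standard at $0.023/GB-month vs. the Infrequent Access tier at $0.0125.
MONITOR_PER_OBJECT = 0.0025 / 1000  # $/object/month
SAVINGS_PER_GB = 0.023 - 0.0125     # $/GB/month if an object goes cold

# Break-even object size: below this, monitoring costs more than tiering
# can ever save, even if the object is never touched again.
break_even_gb = MONITOR_PER_OBJECT / SAVINGS_PER_GB
break_even_kb = break_even_gb * 1024 * 1024
print(f"Break-even object size: {break_even_kb:.0f} KB")
```

The break-even lands in the low hundreds of kilobytes, which is consistent with AWS's own design choice of not monitoring objects under 128 KB at all.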
Versioning without expiration. If you have versioning enabled (you should), make sure you also have a lifecycle rule to expire old versions. I've seen buckets where 90% of the storage cost was non-current object versions that nobody needed.
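Both the multipart tip and the versioning tip are a single lifecycle configuration away. A sketch, with day counts you should adjust to your own retention needs:

```python
# Lifecycle configuration covering the two tips above: abort stale
# multipart uploads and expire non-current object versions.
# Day counts are illustrative -- tune them before applying.
lifecycle = {
    "Rules": [
        {
            "ID": "abort-incomplete-multipart",
            "Status": "Enabled",
            "Filter": {},  # empty filter = whole bucket
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        },
        {
            "ID": "expire-noncurrent-versions",
            "Status": "Enabled",
            "Filter": {},
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        },
    ]
}

# Apply with boto3:
#   s3 = boto3.client("s3")
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=lifecycle)
print(f"{len(lifecycle['Rules'])} lifecycle rules defined")
```

Note that `put_bucket_lifecycle_configuration` replaces the bucket's entire lifecycle config, so merge these rules with any you already have rather than overwriting them.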
Request pricing on archival tiers. Glacier Deep Archive storage is $0.00099/GB/month. Beautiful. But a GET request costs $0.01 per 1,000 requests, and retrieval is $0.02/GB for standard retrieval. If you're frequently accessing archival data, the request costs can dwarf the storage savings. Use S3 Storage Lens to understand your actual access patterns.
Route 53: The Query Pricing Surprise
Route 53 charges $0.40 per million queries for standard hosted zones. That sounds cheap until you realize how many DNS queries a busy application generates. A single page load can trigger several lookups against your own zones (your API, your CDN aliases, your asset domains). At 10 million page views per month, that's on the order of 100-200 million lookups, and even with resolver caching absorbing a share of them, plenty still reach Route 53.
The tip: Use longer TTLs where you can. If your records don't change often, a 300-second TTL instead of a 60-second TTL reduces your query volume by roughly 5x. That's real savings at scale.
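You can put rough numbers on that. This model assumes every cache expiry produces one billed query and ignores resolver-level sharing, so treat it as an upper bound; the lookup volume is invented for illustration:

```python
# Rough Route 53 query-cost model. Assumes one billed query per cache
# expiry and ignores shared resolver caches, so real bills run lower.
PRICE_PER_MILLION = 0.40  # $ per million standard-zone queries

def monthly_query_cost(lookups, ttl_seconds, baseline_ttl=60):
    """Scale billed queries by how much longer records stay cached."""
    billed = lookups * (baseline_ttl / ttl_seconds)
    return billed / 1_000_000 * PRICE_PER_MILLION

lookups = 150_000_000  # hypothetical: ~10M page views, ~15 lookups each
print(f"60s TTL:  ${monthly_query_cost(lookups, 60):.2f}/month")
print(f"300s TTL: ${monthly_query_cost(lookups, 300):.2f}/month")
```

Not a life-changing line item on its own, but it's a free 5x reduction, and the same TTL discipline pays off again in client-side latency.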
Lambda: Provisioned Concurrency Is a Trap (Sometimes)
Rust's fast cold starts make this less relevant, but it's worth saying: provisioned concurrency is the most expensive way to solve a cold start problem. At $0.0000041667 per GB-second of provisioned concurrency, a 512MB function kept warm 24/7 costs about $5.40/month per unit of concurrency. Provision 10 units and you're at roughly $54/month before the function processes a single request.
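The arithmetic, using the rate quoted above and a 30-day month:

```python
# Monthly cost of provisioned concurrency at the rate quoted above.
PC_PER_GB_SECOND = 0.0000041667  # $ per GB-second provisioned
SECONDS_PER_MONTH = 30 * 24 * 3600

def provisioned_cost(memory_mb, units):
    """Cost of keeping `units` warm instances around the clock."""
    return (memory_mb / 1024) * units * SECONDS_PER_MONTH * PC_PER_GB_SECOND

print(f"512MB, 1 unit:   ${provisioned_cost(512, 1):.2f}/month")
print(f"512MB, 10 units: ${provisioned_cost(512, 10):.2f}/month")
```

And remember this is on top of, not instead of, the regular duration and request charges for the invocations themselves.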
If you're considering provisioned concurrency, first ask: Can I reduce cold starts by optimizing my function? Smaller packages, lazy initialization, lighter dependencies? Can I use Rust (now that it's natively supported)? Can I keep the function warm with a scheduled ping?
Provisioned concurrency is the right answer sometimes. But it should be the last answer, not the first.
The Bottom Line
S3 turning 20 is a reminder that the best infrastructure is the kind you stop thinking about. Route 53's global resolver is quietly making multi-region easier. Rust on Lambda is going to change the calculus for performance-sensitive serverless workloads. And Cerebras on Bedrock gives you another option when LLM latency is the bottleneck.
The common thread? AWS keeps getting better at letting you focus on your application instead of your infrastructure. But "better" doesn't mean "cheaper by default" — you still have to be intentional about how you use these services.
That's the week in AWS. See you next Wednesday.
---
Related reads from the archive:
- 5 AWS Cost Mistakes We See Every Week
- Reserved Instances vs. Savings Plans: The 2026 Decision Guide
- When to Use Serverless (And When to Run)
AWS Wednesday is a weekly cloud recap from Edwards Consulting Group. Questions about your AWS setup? We help businesses get this right. [Get in touch](/contact).