The SLA illusion: AWS pays $470K, Salesforce loses $47M
AWS CloudFront promises 99.99% uptime. The SLA says if monthly uptime falls between 99.0% and 99.9%, you get a 10% credit on your CloudFront spend. On February 10, 2026, CloudFront was down for 134 minutes. For a 28-day February, that's roughly 99.67% monthly uptime. Salesforce qualifies for the 10% credit.
Estimating conservatively, Salesforce spends $4.7 million per month on CloudFront (based on typical enterprise CDN contracts and public traffic data). The 10% SLA credit: $470,000.
But Salesforce didn't lose $470K during the outage. According to Salesforce's Q4 2025 10-K, the company generates $73.2 billion annually — $8.35 million per hour on average. During business hours, when this outage hit, that number spikes 2.5x due to enterprise usage concentration. Over 2 hours and 14 minutes, Salesforce lost approximately $47 million in unprocessed transactions, failed deals, and SLA breaches with its own customers.
$47 million in losses. $470,000 in compensation.
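The gap is easy to reproduce. Here is a minimal sketch of both calculations in Python, using the article's assumptions (the CloudFront spend, revenue rate, and 2.5x business-hours multiplier are estimates, not disclosed figures); the credit tiers follow AWS's published CloudFront SLA and should be checked against the current terms:

```python
# Back-of-the-envelope math behind both numbers. All inputs are the article's
# estimates, not figures disclosed by AWS or Salesforce.

MINUTES_IN_FEB_2026 = 28 * 24 * 60        # the SLA measures over the monthly billing cycle
outage_minutes = 134
uptime_pct = 100 * (MINUTES_IN_FEB_2026 - outage_minutes) / MINUTES_IN_FEB_2026

def cloudfront_credit_pct(uptime: float) -> int:
    """Service credit as a percentage of monthly CloudFront spend (published SLA tiers)."""
    if uptime >= 99.9:
        return 0
    if uptime >= 99.0:
        return 10
    if uptime >= 95.0:
        return 25
    return 100

monthly_cdn_spend = 4_700_000             # estimated CloudFront bill
credit = monthly_cdn_spend * cloudfront_credit_pct(uptime_pct) / 100

avg_revenue_per_hour = 8_350_000          # $73.2B / 8,760 hours
business_hours_multiplier = 2.5           # the article's peak-hours assumption
loss = avg_revenue_per_hour * business_hours_multiplier * outage_minutes / 60

print(f"uptime {uptime_pct:.2f}% | SLA credit ${credit:,.0f} | estimated loss ${loss:,.0f}")
# -> uptime 99.67% | SLA credit $470,000 | estimated loss $46,620,833
```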
The AWS SLA covers 1% of the actual cost. The customer eats the other 99%. And this is Salesforce — one of the largest companies in the world with the financial cushion to absorb the hit. For a startup or mid-sized SaaS, an outage of this magnitude can be terminal.
Cloud SLAs aren't insurance against losses. They're discounts on next month's bill. The illusion of enterprise protection.
February 10, 2026: The 134-minute outage that exposed shared fate architecture
At 18:47 UTC on February 10, 2026, DNS resolution for AWS CloudFront distributions began failing. Over the next 134 minutes, Salesforce CRM, Anthropic's Claude API, Adobe Creative Cloud, Discord, Twitch, Slack integrations, Zendesk, and 20-plus other platforms that depend on AWS infrastructure went dark.
This wasn't an isolated CloudFront failure. The cascade spanned eight AWS services in total: CloudFront itself, plus Route 53, API Gateway, WAF, Shield, Certificate Manager, Secrets Manager, and Systems Manager. According to ThousandEyes data, 87% of CloudFront's 600+ edge PoPs went offline. This wasn't a regional issue. It was systemic.
AWS published a terse statement on its Health Dashboard acknowledging "DNS resolution issues affecting CloudFront." Two hours and 14 minutes later, at 21:01 UTC, service was restored. No detailed technical explanation. No public post-mortem. Just the standard "we have identified and mitigated the issue."
The numbers tell a different story.
| AWS Service | Role in CloudFront | Impact of DNS Failure |
|---|---|---|
| Route 53 | DNS resolution for distributions | No DNS, no traffic |
| Certificate Manager | SSL/TLS validation | Certificates unreachable, connections rejected |
| API Gateway | Backend for serverless apps | Endpoints unreachable |
| WAF | Malicious traffic filtering | No requests arrive for the rules to filter |
AWS's multi-AZ architecture protects against hardware failures and the loss of individual data centers. It doesn't protect against failures in shared control plane services. Route 53 is global by design, and the Certificate Manager certificates that CloudFront serves all live in us-east-1: shared dependencies that cut across every region. A failure in Route 53 doesn't stay contained in us-east-1. It affects all regions simultaneously because DNS is, by design, global.
Why AWS redundancy failed: Global control plane dependencies
CloudFront depends on Route 53 to resolve DNS records for its distributions. When Route 53's authoritative nameservers failed health checks, the CNAME resolution chain broke. CloudFront edge PoPs started rejecting requests because they couldn't validate SSL certificates hosted in Certificate Manager — which also uses Route 53 for internal lookups.
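You can watch this layering from the outside. Here is a monitoring-style sketch (hypothetical hostname; dnspython assumed installed) that separates "the DNS chain is broken" from "DNS resolves but the TLS/certificate layer is failing", which is roughly the progression described above:

```python
"""Tell a DNS-layer failure apart from a TLS/certificate-layer failure, from outside.

A probe sketch, not AWS internals. The hostname is hypothetical; swap in your own
CloudFront alias. Requires dnspython (`pip install dnspython`).
"""
import socket
import ssl

import dns.resolver


def check_dns(hostname: str) -> list[str]:
    """Resolve the CDN alias (dnspython follows the CNAME chain); raises if broken."""
    answers = dns.resolver.resolve(hostname, "A")
    return [rr.to_text() for rr in answers]


def check_tls(hostname: str, ip: str, timeout: float = 5.0) -> str:
    """Complete a validated TLS handshake against one resolved IP; return the TLS version."""
    ctx = ssl.create_default_context()   # verifies the certificate chain and hostname
    with socket.create_connection((ip, 443), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.version()


if __name__ == "__main__":
    host = "cdn.example.com"  # hypothetical CloudFront alias
    try:
        ips = check_dns(host)
        print(f"DNS ok: {host} -> {ips}")
    except Exception as exc:
        raise SystemExit(f"DNS layer broken (the February 10 failure mode): {exc}")
    try:
        print(f"TLS ok ({check_tls(host, ips[0])})")
    except Exception as exc:
        print(f"DNS resolves but the TLS/certificate layer is failing: {exc}")
```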
This is what reliability engineers call "shared fate": when a global control plane service fails, the blast radius is effectively unbounded. Multi-AZ doesn't help. Regional failover doesn't help. The architecture multiplies the impact.
Cloudflare, by contrast, operates with isolated failure domains. If one PoP goes down, the others stay operational. AWS, with its global shared services (Route 53, Certificate Manager), multiplies the blast radius of a single point of failure.
This wasn't the first time. In December 2021, a failure in us-east-1 took services down for 7 hours. In February 2017, a typo in an S3 script caused 4 hours of downtime. In November 2020, Kinesis dragged multiple services offline for 8+ hours.
The pattern is clear: when an AWS control plane service fails, the cascade is inevitable.
The multi-cloud dilemma: $2.8M setup vs. actuarial risk
The obvious solution is multi-cloud redundancy: use Cloudflare as a secondary CDN, keep DNS with an external provider (e.g., NS1), and configure circuit breakers that automatically redirect traffic on failure. Any platform with hard uptime requirements should be running multi-CDN for exactly this reason.
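The circuit-breaker piece itself is conceptually simple; the expensive part is everything around it. A minimal sketch, with hypothetical hostnames, arbitrary probe thresholds, and the provider-specific DNS cutover left as a stub:

```python
"""Shape of the circuit-breaker piece of a multi-CDN setup (a sketch, not a product).

Hostnames are hypothetical. The actual cutover depends on your external DNS
provider's API, so it is stubbed out here.
"""
import time
import urllib.request

PRIMARY = "https://cdn-primary.example.com/healthz"      # CloudFront alias (hypothetical)
SECONDARY = "https://cdn-secondary.example.com/healthz"  # Cloudflare alias (hypothetical)
FAILURE_THRESHOLD = 3        # consecutive failed probes before failing over
RECOVERY_THRESHOLD = 5       # consecutive good probes before failing back
PROBE_INTERVAL_S = 10


def healthy(url: str, timeout: float = 5.0) -> bool:
    """Synthetic health probe against a CDN endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:
        return False


def set_active_cdn(target: str) -> None:
    # Stub: update the low-TTL record at your external DNS provider here.
    # Expect 30-90 seconds of propagation before clients follow the change.
    print(f"[failover] switching active CDN to {target}")


def run() -> None:
    on_primary, failures, recoveries = True, 0, 0
    while True:
        if on_primary:
            failures = failures + 1 if not healthy(PRIMARY) else 0
            if failures >= FAILURE_THRESHOLD:
                set_active_cdn(SECONDARY)
                on_primary, failures = False, 0
        else:
            recoveries = recoveries + 1 if healthy(PRIMARY) else 0
            if recoveries >= RECOVERY_THRESHOLD:
                set_active_cdn(PRIMARY)
                on_primary, recoveries = True, 0
        time.sleep(PROBE_INTERVAL_S)


if __name__ == "__main__":
    run()
```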
But according to InfoQ analysis of enterprise multi-CDN architectures, the initial setup cost is $2.8 million for a mid-sized organization: configuration migration, duplicate SSL certificates, DNS failover logic, disaster recovery testing, SRE team training.
And the ongoing overhead runs 15-20% on top of baseline CDN spend: duplicate CDN bills (even though Cloudflare can come in roughly 40% cheaper than CloudFront, you're still paying two providers), operational complexity (two dashboards, two log pipelines, two update cycles), and added failover latency (typically 30-90 seconds for DNS propagation at low TTLs).
For a startup or growth-stage SaaS, $2.8M is prohibitive. They prefer to accept the actuarial risk: AWS has had four multi-hour outages in roughly a decade, and only some of those land in the peak business-hours window that produces worst-case losses, so call the annual probability of a worst-case hit ~16%. The expected cost of downtime ($47M Ă— 16% ≈ $7.5M annualized for a Salesforce-sized company, far less for a startup) doesn't justify the immediate capex of multi-cloud.
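Run the article's own numbers and the dilemma is plain. A rough comparison, where the five-year amortization period and the overhead midpoint are added assumptions:

```python
# Rough actuarial comparison the article gestures at. All inputs are the
# article's estimates or stated assumptions, not measured values.

worst_case_loss = 47_000_000        # peak-hours, 2+ hour outage
annual_probability = 0.16           # article's estimate of a worst-case hit in a given year
expected_downtime_cost = worst_case_loss * annual_probability

setup_cost = 2_800_000              # one-time multi-CDN build-out, amortized over 5 years
annual_cdn_spend = 4_700_000 * 12
overhead_pct = 0.175                # midpoint of the 15-20% ongoing overhead estimate
annual_multicloud_cost = setup_cost / 5 + annual_cdn_spend * overhead_pct

print(f"expected annual downtime cost: ${expected_downtime_cost/1e6:.1f}M")
print(f"annualized multi-cloud cost:   ${annual_multicloud_cost/1e6:.1f}M")
# -> roughly $7.5M of expected loss vs. ~$10.4M of redundancy cost, even at
#    Salesforce scale; for a smaller SaaS the loss side shrinks much faster
#    than the redundancy side, which is the whole dilemma.
```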
Let's be real: AWS has built a model where the overwhelming majority of customers accept concentration risk because the alternative is financially unviable until they're already large enough to afford redundancy.
And when they reach that size, vendor lock-in (Lambda functions, RDS configurations, IAM policies, VPC peering connections) makes migration even more expensive. It's a perverse incentive: the more you depend on AWS, the more AWS earns, and the costlier it becomes to leave.
Until the DNS fails and you lose $47 million in an afternoon. Then you realize the SLA was never insurance. It was just marketing with fine print.
4 major outages in a decade: This is the pattern, not the exception
AWS controls 32% of the global cloud infrastructure market according to Gartner (Q4 2025): roughly one in three dollars spent on cloud infrastructure flows through AWS. When AWS falls, it drags a third of global cloud computing down with it.
And AWS falls more often than its marketing suggests. According to the AWS Post-Event Summaries archive, the past decade has seen at least 4 outages longer than 2 hours:
- November 2020: Kinesis failure in us-east-1 takes down Cognito, CloudWatch, Alexa. 8+ hours of downtime.
- February 2017: Typo in S3 command deletes control plane servers. 4 hours offline.
- December 2021: us-east-1 failure affects Lambda, RDS, ECS. 7 hours of chaos.
- February 2026: This CloudFront outage. 2 hours 14 minutes.
And the competition? Neither Google Cloud nor Cloudflare has a spotless record (both had significant incidents in mid-2025), but neither shows the same recurring pattern of a shared global control plane dragging dozens of dependent services down for hours at a time.
This isn't an exceptional event. It's systemic risk architecture operating as designed.