DNS Provider Concentration Risk: Why Having All Your Eggs in One Basket is Dangerous
October 21, 2016. Dyn, a major DNS provider, suffered a massive DDoS attack. Twitter, Netflix, Reddit, GitHub, Spotify, and hundreds of other major sites went offline simultaneously.
Why? They all depended on the same DNS provider.
DNS concentration risk is the hidden vulnerability that can take down your entire online presence in one outage. Understanding it means understanding why some of the world's most sophisticated technical organizations went dark for hours because of a single point of failure they hadn't fully appreciated.
What Is DNS Concentration Risk?
DNS concentration risk occurs when multiple domains or critical services depend on the same DNS infrastructure. The vulnerability lives in a dependency chain that most people never think about. When a user types your domain name, their DNS resolver asks your authoritative nameservers for the IP address. Your nameservers respond with that IP, and then the user connects to your server.
If your nameservers can't respond, nothing works. Your servers can be running perfectly, your applications fully functional, your database responsive - but if DNS fails, users can't reach any of it. The domain name that defines your entire online identity simply stops working.
Concentration becomes dangerous when all your domains use the same DNS provider, when all your critical services share DNS infrastructure, or when your DNS provider itself has a single point of failure. Many organizations check all three boxes without realizing it.
The Dyn Attack: A Case Study
The 2016 Dyn attack illustrated concentration risk at scale. The Mirai botnet, assembled from compromised IoT devices like cameras and DVRs, launched a massive distributed denial of service attack against Dyn's infrastructure. As Dyn's authoritative nameservers became unreachable under the flood of malicious traffic, any domain using those nameservers stopped resolving. The sites themselves were technically operational - the servers were up, the applications were running - but nobody could reach them because DNS queries went unanswered.
The list of affected companies read like a who's who of the internet: Twitter, Netflix, Reddit, GitHub, Spotify, SoundCloud, Airbnb, PayPal, and hundreds more. These weren't small companies with inadequate infrastructure budgets. They were some of the most technically sophisticated organizations in the world. They had chosen Dyn specifically for reliability and performance. What many of them hadn't done was implement secondary DNS providers, creating a single point of failure that affected them all simultaneously when that single provider went down.
The business impact was measured in hundreds of millions of dollars in lost revenue across all affected companies. Beyond the immediate financial hit, the attack caused lasting reputation damage and customer trust erosion. The operational chaos of responding to a complete outage while unable to use many of the tools typically available for incident response compounded the problem.
Assessing Your DNS Concentration
Understanding your exposure starts with inventorying your DNS configuration. For every domain you own, document what nameservers are configured, who operates those nameservers, and what the underlying infrastructure looks like. A typical inventory might reveal that your main product domain, your marketing site, and your API all use the same DNS provider. If that provider has issues, all three go down together.
Once you have that inventory, identify your single points of failure. Ask yourself whether all your domains use the same provider, whether that provider has geographic redundancy, what their track record for uptime looks like, and whether you have any secondary DNS configured. For each potential failure scenario, evaluate the impact: if this DNS provider goes down, what services are affected, what's the business cost per hour of downtime, and how quickly could you recover?
DNS Redundancy Strategies
Secondary DNS Provider
The most robust approach runs two DNS providers simultaneously, with both serving authoritative responses for your domains. You configure your domain with nameservers from both providers. Resolvers can query either one, and if one provider fails, the other responds.
This approach survives single provider failures and provides geographic diversity, since different providers have different data center footprints. You may even see performance benefits as resolvers tend to query the closest available nameserver.
The trade-offs are real, though. Managing two providers is more complex than managing one. You must keep both providers in sync, which means either manual updates in two places or automated tooling to ensure consistency. The cost is higher, both in terms of provider fees and operational overhead.
Provider Diversity Across Domains
Rather than running multiple providers for each domain, you can use different providers for different domains. Your main website might use Cloudflare, your backup site Route 53, and your status page NS1.
This approach ensures not all services fail simultaneously. Critically, your status page can remain operational to report on main site outages - something impossible if the status page shares DNS infrastructure with the services it monitors. Partial continuity during incidents helps maintain customer communication and reduces support burden.
This strategy works well for organizations with multiple independent properties, for status pages that should never share DNS with main services, and for disaster recovery systems that need to operate independently.
Self-Hosted Plus Provider
Running your own nameservers alongside a commercial provider gives you control over part of your infrastructure while maintaining commercial-grade redundancy. Your domain might list two self-hosted nameservers plus one nameserver from a commercial provider.
You maintain control over your own infrastructure and aren't completely dependent on any provider's decisions. You can optimize for your specific needs and use cases. The downside is significant operational burden. Running authoritative DNS reliably requires genuine expertise, and you're responsible for the security, availability, and performance of your self-hosted infrastructure.
Anycast Provider with Multiple Points of Presence
Even with a single provider, choosing one with globally distributed anycast infrastructure provides significant resilience. Anycast means a single IP address is announced from multiple locations around the world. Users route to the nearest or fastest instance automatically. Local failures at individual data centers don't affect global availability.
Major anycast DNS providers include Cloudflare, AWS Route 53, Google Cloud DNS, NS1, and DNSimple. Their globally distributed infrastructure means an attack or outage affecting one region doesn't necessarily affect users routing to other regions.
Special Considerations
Status Pages
Never host your status page on the same infrastructure as your main service. If everything shares DNS and your DNS provider fails, your main application goes down and your status page goes down with it. Customers have no way to check what's happening, and your support team gets overwhelmed with inquiries from people who can't even find your status page.
The better approach uses a different DNS provider for your status page, or employs a third-party status service like Statuspage or Better Uptime that operates on completely independent infrastructure.
API Endpoints
APIs need even higher reliability than websites because customers' applications depend on them. When your API goes down, their products fail, and you damage business relationships that took years to build. Secondary DNS is nearly mandatory for production APIs. Consider using a dedicated domain for API endpoints, separate from your website, with a different DNS provider than your main marketing properties.
Email Delivery
A DNS outage is simultaneously an email outage. When your MX records become unreachable, nobody can email you, and your outbound email may fail SPF and DKIM validation checks when receiving servers can't verify your DNS records.
Email may be more critical to your operations than your website. Recovery takes longer due to the interaction between DNS TTLs and email retry mechanisms. Business communication halts completely, affecting sales, support, and internal coordination.
Monitoring for DNS Issues
Even with redundancy, continuous monitoring of your DNS health catches problems before they become catastrophic.
Monitor nameserver availability by checking whether each of your nameservers can be reached, measuring response times from multiple geographic locations, and tracking query success rates. Monitor record accuracy to ensure records return expected values and that all nameservers return consistent data. Watch for propagation issues where changes don't reach all resolvers as expected.
Subscribe to your providers' status pages, monitor their social media accounts, and join any communication channels they offer for service alerts. When your provider knows about a problem, you should know too.
Configure immediate alerts for nameserver unreachability, records returning incorrect values, or significant latency increases. Aggregate alerts for intermittent failures, minor latency changes, and scheduled maintenance to avoid alert fatigue while still maintaining visibility.
Building Your DNS Resilience Plan
Start by documenting your current state. Create an inventory of all domains and their DNS configuration, identify concentration risks where multiple domains share infrastructure, and calculate the potential business impact of each failure scenario.
Prioritize by criticality, recognizing that not every domain needs the same level of protection. Critical domains that need redundancy include your primary business domain, API endpoints, customer-facing applications, and anything that directly generates revenue. Important domains like marketing sites, internal tools, and development environments should be monitored closely but may not require full redundancy. Standard domains like test properties, parked domains, and low-traffic sites need basic monitoring but minimal additional investment.
For critical domains, implement redundancy by adding a secondary DNS provider, testing failover regularly, and keeping configurations synchronized. Set up monitoring that checks from multiple external locations, monitors your internal DNS infrastructure if applicable, and tracks provider status.
Create a response playbook documenting what to do when primary DNS fails, contact information for your providers' emergency support lines, steps to accelerate failover to secondary providers, and a communication plan for notifying stakeholders during incidents.
The Cost-Benefit Analysis
The cost of secondary DNS is modest. Expect to pay $5-50 per month for a secondary provider, plus time to set up and maintain synchronization, and occasional debugging when propagation issues arise.
The cost of a DNS outage is substantial. You face revenue loss for every minute of downtime, customer trust damage that takes months to rebuild, potential SLA violations with contractual penalties, operational chaos as teams scramble to respond, and possible customer churn as users abandon services they can't rely on.
For any business-critical domain, secondary DNS pays for itself many times over. The question isn't whether you can afford redundancy - it's whether you can afford not to have it.
The Bottom Line
DNS concentration risk is invisible until it becomes catastrophic. The Dyn attack proved that even the best DNS providers can fail, and when they do, everyone who depends on them fails together.
Assess your concentration risk. Add redundancy where it matters. Monitor continuously. The few hundred dollars a year for secondary DNS is trivial compared to the cost of a complete outage.
Don't let your DNS be your single point of failure.
Recommended Reading
- The True Cost of a Forgotten Domain - Understanding downtime costs
- Building a Domain Health Checklist - Complete domain audit
- Free Uptime Monitoring - Monitor your infrastructure
Morten Pradsgaard is the founder of exit1.dev — the free uptime monitor for people who actually ship. He writes no-bullshit guides on monitoring, reliability, and building software that doesn't crumble under pressure.