What is a CDN and Why Do I Need One?
Content Delivery Networks (CDNs) are systems of servers that deliver your content from the edge of the Internet. What exactly does that mean? If the Internet is a cloud, how can it have an edge? Hopefully after reading this you’ll understand what CDNs are and how they can improve your customers’ experience.
The Internet has been very successfully marketed as a “cloud.” When I think of the Internet cloud, I don’t think of soft puffy clouds, but of a dark Seattle winter day where the sky is one giant all-encompassing, all-connected cloud. Since everything is connected, it is easy to visualize all the data coming together and making its way to customers (in my mental picture it’s through a never-ending drizzle of data).
The Space Needle in the clouds.
In reality, this is not how the Internet works. While our data may not be hosted at OUR physical location, it still lives on servers that have discrete physical locations somewhere on earth. Where your servers are physically located has a direct impact on how long it takes for data to get from your server to your customers.
Let’s talk physics (I promise it’ll be easy). Light travels through space at 300 million meters per second. Since the “tubes” of the Internet are primarily fiber optic cables, your data is travelling at the speed of light. Light is somewhat slower in glass, though, clocking in at (only) 200 million meters per second. At that speed, data can circumnavigate the earth 5 times a second. If your servers are in New York, it takes 23ms for data to reach me in Seattle, so the round trip time (RTT) is 46ms.
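For those who like to check the arithmetic, here is a minimal Python sketch of the numbers above. The 200 million meters per second and the 23ms one-way delay come from the text; the earth's circumference (about 40,075 km) and the helper names are my own additions for illustration, and the path length is simply what the 23ms figure implies, not a measured cable route.

```python
# Propagation delay in optical fiber, using the figures quoted above.
SPEED_IN_FIBER_KM_PER_S = 200_000  # 200 million m/s, from the text
EARTH_CIRCUMFERENCE_KM = 40_075    # assumption: equatorial value, not in the text


def one_way_ms(path_km: float) -> float:
    """Time for light to cover path_km of fiber, in milliseconds."""
    return path_km / SPEED_IN_FIBER_KM_PER_S * 1000


def implied_path_km(one_way_delay_ms: float) -> float:
    """Fiber path length that would produce the given one-way delay."""
    return one_way_delay_ms / 1000 * SPEED_IN_FIBER_KM_PER_S


laps_per_second = SPEED_IN_FIBER_KM_PER_S / EARTH_CIRCUMFERENCE_KM
nyc_seattle_km = implied_path_km(23)     # 23 ms one way, from the text
rtt_ms = 2 * one_way_ms(nyc_seattle_km)  # there and back again

print(f"Laps around the earth per second: {laps_per_second:.2f}")
print(f"Implied NYC-Seattle fiber path: {nyc_seattle_km:.0f} km")
print(f"Round trip time: {rtt_ms:.0f} ms")
```

Real routes add delay from routers and indirect cable paths, so treat this as a lower bound rather than a prediction.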
Every time I connect to your server, at least 4 round trips have to complete (for those curious, these are: the DNS lookup, the TCP handshake, the TLS (security) handshake, and the initial HTTP request). Four round trips add up to 184ms before ANY data is sent! If I were in Sydney, the RTT to New York is 160ms, so a connection setup from NYC takes over 600ms!
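The setup cost is just the RTT multiplied by the number of round trips. A small sketch, assuming (as the text does) that each of the four steps costs exactly one round trip; the function name is hypothetical, and real connections can need more or fewer trips depending on protocol versions and caching.

```python
# Simplified model: each setup step costs one full round trip.
SETUP_STEPS = ("DNS lookup", "TCP handshake", "TLS handshake", "HTTP request")


def setup_time_ms(rtt_ms: float, steps: tuple = SETUP_STEPS) -> float:
    """Total time spent on round trips before the first response arrives."""
    return rtt_ms * len(steps)


# RTTs to a New York server, as quoted in the text.
rtt_to_new_york_ms = {"Seattle": 46, "Sydney": 160}

for city, rtt in rtt_to_new_york_ms.items():
    print(f"{city}: {setup_time_ms(rtt):.0f} ms of setup before any data")
```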
In Comes the CDN (or – you survived the science!)
There is a lot of data on how long customers will wait for a webpage to load, but the basic conclusion is that you need to get your content to your customers as quickly as possible (the payoff is more repeat views, higher customer satisfaction, and more SALES). This is where a CDN comes in. A Content Delivery Network places copies of your content on servers that are geographically distributed throughout the world. Imagine you are using a CDN, and your file (formerly hosted only in NYC) is now mirrored in San Francisco, where the RTT to Seattle is just 13ms. Now the 4 round trips to Seattle take 52ms, which is 3.5x faster!
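To make the comparison concrete, here is a rough sketch that picks the closest server for a Seattle visitor and works out the speedup. The two RTTs come from the text; the server labels and function names are made up for illustration, and choosing the minimum RTT is a simplification of how real CDNs steer traffic (typically through DNS or anycast routing).

```python
ROUND_TRIPS_PER_SETUP = 4  # DNS, TCP, TLS, HTTP, as described earlier

# RTTs seen by a visitor in Seattle, in ms, as quoted in the text.
rtt_from_seattle_ms = {
    "New York (origin)": 46,
    "San Francisco (edge)": 13,
}


def nearest_server(rtts: dict) -> str:
    """Return the server name with the lowest round trip time."""
    return min(rtts, key=rtts.get)


def setup_time_ms(rtt_ms: float) -> float:
    """Connection setup time under the four-round-trip model."""
    return rtt_ms * ROUND_TRIPS_PER_SETUP


origin_setup = setup_time_ms(rtt_from_seattle_ms["New York (origin)"])
best = nearest_server(rtt_from_seattle_ms)
edge_setup = setup_time_ms(rtt_from_seattle_ms[best])
speedup = origin_setup / edge_setup

print(f"Origin only: {origin_setup} ms")
print(f"Via {best}: {edge_setup} ms")
print(f"Speedup: {speedup:.1f}x")
```

Adding more edge locations to the dictionary only ever helps in this model, since the visitor is always sent to whichever server is closest.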
In theory, this all sounds great, but does it hold in practice? There are several studies that measure a single website with CDN caching on and off, but does the result hold true for the whole Internet? The HTTP Archive (httparchive.org) is a database of website performance stats for the top 200k desktop and 5k mobile websites. One measurement (only for desktop, unfortunately) is whether the main page HTML is hosted on a CDN. The results are recorded in Redwood City, CA using IE 9 (and this data is from 2/1/2014). Presented below are the median values for the Time to First Byte (TTFB) and the time it took for the website to start rendering (both measured in milliseconds):
As you can see, sites with a CDN serving the HTML of the homepage deliver the content 26% faster, and begin rendering content ~20% faster than those without.
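If you want to run the same kind of comparison on your own measurements, the calculation looks roughly like this. This is a sketch only: the sample values are placeholders I invented to show the mechanics, NOT HTTP Archive data, and I am assuming that "X% faster" means the median time is X% lower than the no-CDN median.

```python
from statistics import median


def percent_faster(baseline_ms: float, improved_ms: float) -> float:
    """Reduction in time, as a percentage of the baseline."""
    return (baseline_ms - improved_ms) / baseline_ms * 100


def cdn_improvement(samples) -> float:
    """samples: iterable of (uses_cdn, time_ms) pairs, one per website."""
    with_cdn = [ms for uses_cdn, ms in samples if uses_cdn]
    without_cdn = [ms for uses_cdn, ms in samples if not uses_cdn]
    return percent_faster(median(without_cdn), median(with_cdn))


# Placeholder values for illustration only, not real measurements.
placeholder_samples = [
    (True, 100), (True, 150), (True, 400),
    (False, 120), (False, 200), (False, 900),
]

improvement = cdn_improvement(placeholder_samples)
print(f"Median time with a CDN is {improvement:.0f}% lower")
```

Using the median rather than the mean keeps a handful of very slow sites from dominating the result.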
Pages also finish loading faster if there is a CDN present. The Speed Index is a measurement of how fast the page appears on the screen. For sites with similar numbers of requests, those with at least the first page hosted on a CDN are generally 15-20% faster (I am using the median value of the Speed Index below):
In reality, I believe that CDN performance is BETTER than what is shown above. Many top websites in this database (like Amazon, Microsoft Live and Yahoo) are shown as not using CDNs. Since Amazon’s CloudFront and Microsoft’s Azure are top CDNs, these sites are very likely served from a CDN anyway, which makes the “no CDN” group look faster than it really is.
CDNs for Mobile Content?
There is concern that mobile performance is not helped as much by CDN delivery, as the percentage improvement is lower. Ilya Grigorik of Google has posted an excellent explanation of latency on mobile networks, and of how CDNs really do help your mobile performance.
Getting your content to your customers as quickly as possible has been proven to be good for business. Tools like AT&T’s Application Resource Optimizer (ARO) help your developers figure out how they can make their code run faster and more efficiently. But, as the data above shows, one simple way to get your content to your customers as quickly as possible would be to host your data on a CDN “at the edge of the cloud.”