This article is a comparative study of how the TCP initcwnd values used by Content Delivery Networks (CDNs) have kept increasing over the years. The Internet keeps growing, and global average network speeds have reached levels that were hard to imagine a decade ago. Along the same lines, web pages and the objects they contain keep growing in size. Over the course of our experiments, we analyzed the initcwnd values of various CDNs and how they have changed over the years. We present background on the concepts involved before moving on to our results and observations.
What’s TCP Congestion Control?
There are four intertwined algorithms for TCP congestion control: slow start, congestion avoidance, fast retransmit, and fast recovery. Let me explain the four algorithms in brief, and then we can look at the importance of the congestion window in determining the performance of the network as a whole.
Note: Familiarity with basic TCP terminology is needed to follow this article. Nevertheless, two important definitions are explained below.
Cwnd (Congestion Window) is a TCP state variable that limits the amount of data that can be sent into the network before receiving an acknowledgement. Rwnd (Receiver Window) is a TCP variable that advertises the amount of data the receiver can accept. The sender may have at most the smaller of the two outstanding at any time, so together they provide TCP congestion control and flow control.
Slow start is an algorithm that gradually increases the amount of data injected into the network in order to find the network's data carrying capacity. It probes the path between the sender and the receiver to determine a suitable window size. The congestion window (cwnd) is the sender-side limit, whereas the receiver's advertised window (rwnd) is the receiver-side limit. Initially, the sender sends data with a small congestion window, the initial congestion window (initcwnd). The sender keeps increasing the window as long as the receiver keeps acknowledging packets, until either a loss is detected or the ssthresh limit is reached. Once cwnd passes ssthresh, congestion avoidance takes over.
During slow start, a TCP implementation increases cwnd on every incoming ACK as per the equation below.
cwnd += min(N, SMSS)
N – number of previously unacknowledged bytes acknowledged in the incoming ACK.
SMSS – Sender’s Maximum segment size.
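To make the update rule concrete, here is a minimal sketch in Python. It assumes an idealized connection with no loss, no delayed ACKs, and one ACK per full-sized segment; the function names and the 1460-byte SMSS are illustrative choices, not part of any real TCP stack.

```python
def slow_start_on_ack(cwnd: int, acked_bytes: int, smss: int) -> int:
    """Return the new cwnd (in bytes) after one ACK during slow start."""
    # cwnd += min(N, SMSS), where N is the number of newly acknowledged bytes
    return cwnd + min(acked_bytes, smss)


def slow_start_round(cwnd: int, smss: int) -> int:
    """Simulate one RTT in which every segment in the window is ACKed once."""
    segments_in_flight = cwnd // smss
    for _ in range(segments_in_flight):
        cwnd = slow_start_on_ack(cwnd, smss, smss)
    return cwnd


SMSS = 1460            # assumed segment size in bytes
cwnd = 10 * SMSS       # assumed initcwnd of 10 segments
history = [cwnd // SMSS]
for _ in range(3):
    cwnd = slow_start_round(cwnd, SMSS)
    history.append(cwnd // SMSS)

print(history)  # window size in segments after each RTT: [10, 20, 40, 80]
```

Under these assumptions the window doubles every round trip, which is why the starting value, initcwnd, has such a large effect on short transfers.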
Congestion avoidance takes over when cwnd becomes greater than the ssthresh value. During the congestion avoidance phase, cwnd is incremented by roughly one segment per RTT. The following per-ACK update may be used for this.
cwnd += SMSS*SMSS/cwnd
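The claim of "roughly one segment per RTT" can be checked with a short sketch. It assumes the same idealized conditions (no loss, one ACK per full-sized segment), and the function names are hypothetical.

```python
def congestion_avoidance_on_ack(cwnd: float, smss: int) -> float:
    """Return the new cwnd (in bytes) after one ACK during congestion avoidance."""
    return cwnd + smss * smss / cwnd


def congestion_avoidance_round(cwnd: float, smss: int) -> float:
    """Simulate one RTT in which every segment in the window is ACKed once."""
    acks = int(cwnd // smss)
    for _ in range(acks):
        cwnd = congestion_avoidance_on_ack(cwnd, smss)
    return cwnd


SMSS = 1460                      # assumed segment size in bytes
start = float(20 * SMSS)         # assumed window of 20 segments
end = congestion_avoidance_round(start, SMSS)
growth_in_segments = (end - start) / SMSS

print(round(growth_in_segments, 2))  # just under one segment per RTT
```

Each ACK adds only a fraction of a segment, so a full window of ACKs adds up to slightly less than one SMSS, in contrast to the doubling seen in slow start.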
Fast retransmit and fast recovery help keep the sending rate up by using triple duplicate ACKs. When a segment is received out of order, the receiver immediately sends a duplicate ACK to the sender. When the sender receives three duplicate ACKs for the same segment, it immediately retransmits the missing segment without waiting for the retransmission timer to expire.
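The duplicate-ACK trigger can be sketched as a small counter. This is a simplified illustration: a real TCP stack applies further conditions before treating an ACK as a duplicate, and the class name and ACK numbers below are hypothetical.

```python
DUP_ACK_THRESHOLD = 3  # three duplicate ACKs trigger fast retransmit


class FastRetransmitDetector:
    """Tracks ACK numbers and flags when the third duplicate arrives."""

    def __init__(self) -> None:
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_no: int) -> bool:
        """Return True if this ACK is the third duplicate of the same ACK number."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            return self.dup_count == DUP_ACK_THRESHOLD
        # A new ACK number resets the duplicate counter.
        self.last_ack = ack_no
        self.dup_count = 0
        return False


detector = FastRetransmitDetector()
acks = [1000, 2000, 2000, 2000, 2000, 3000]  # hypothetical ACK numbers
triggers = [detector.on_ack(a) for a in acks]

print(triggers)  # [False, False, False, False, True, False]
```

The first ACK for 2000 is the original; the retransmit fires on the third repeat, well before a retransmission timer would normally expire.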
How does a CDN Work?
CDNs cache static (as well as dynamic) content in various geographical locations (using edge caches or POPs) across a region or worldwide, thereby bringing resources closer to users and reducing the round trip time. So, when a user sitting in India tries to access an object, the nearest server can send it to the user with a lower RTT, thereby reducing the page load time and improving efficiency.
Work Flow: Consider an example with the below diagram. Skyscanner (China), with its domain name tianxun.com, has signed up with a CDN. ChinaCache (the CDN provider for Skyscanner China) provides them with a CDN URL, e.g. res.tianxun.com. The developer then configures the site to load static content from the CDN URL. So, when User A, who is sitting in China, tries to access tianxun.com, the browser requests the objects using the linked CDN URL. The CDN then directs the request to the nearest CDN server, and the content is served from the server at location A, which is nearest to User A. Similarly, when User B tries to access Skyscanner, the content is served from the CDN server at location D, as it is nearest to User B. This improves website performance and reduces load on the origin server, thus improving the overall customer experience.
CDNs use various technologies, including anycast routing, to distribute the requests.
Anycast is a network addressing and routing technique for finding the best path between a client requesting a resource and a set of geographically distributed servers. Anycast works by advertising the same IP address from servers hosted at multiple locations. The network layer's dynamic routing then delivers each packet to the nearest server (typically the one with the lowest number of hops). We can think of it as unicast to the nearest address, since only one receiver (the nearest) is selected from all the available ones.
When a client sends a request with the anycast IP address in its packet header, it is the routers that select the best (nearest) destination among the multiple servers running the same service. Anycast is implemented using the Border Gateway Protocol (BGP), which uses Autonomous System numbers for inter-domain routing. Understanding the implementation of anycast in depth requires understanding inter-domain routing and BGP; for more, refer to the BGP specification, RFC 4271.
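As a toy model of the selection described above, the sketch below picks the site with the fewest hops for each client region. Everything in it is hypothetical: the hop counts, the site names, and the documentation address 192.0.2.1. Real BGP route selection depends on AS path length and operator policy rather than a simple hop table.

```python
ANYCAST_IP = "192.0.2.1"  # documentation address, stands in for a real anycast IP

# Hypothetical hop counts from each client region to each site
# that advertises ANYCAST_IP.
HOPS = {
    "India": {"Singapore": 4, "London": 9, "Virginia": 14},
    "Germany": {"Singapore": 11, "London": 3, "Virginia": 8},
}


def nearest_site(client_region: str) -> str:
    """Return the advertising site with the fewest hops from the client."""
    routes = HOPS[client_region]
    return min(routes, key=routes.get)


for region in HOPS:
    print(f"{region} -> {ANYCAST_IP} is served from {nearest_site(region)}")
```

Both clients address the same IP, yet each ends up at a different physical server, which is the property the traceroute test later in this article relies on.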
The main reason anycast became popular among CDNs is that it helps reduce latency, which in turn speeds up page load time and enhances the performance of the website. This is also the main reason why CDNs themselves became popular.
Importance of TCP Initial congestion window
The TCP initial congestion window, or initcwnd, is used at the start of a TCP connection. At the start of an HTTP session, when a client requests a resource, the server's initcwnd determines how many data packets are sent in the initial burst of data.
If the initcwnd value is large, fewer RTTs are required to download the same file. But we cannot set initcwnd to a huge value, because the network and its routers have limited buffers. Exceedingly large values may lead to router buffer overflows, packet loss, and retransmissions. So we need an optimal initcwnd value, one that scales with the available network bandwidth.
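The effect on round trips can be estimated with a rough model. The sketch below assumes that cwnd doubles every RTT, that there is no loss, and that the receiver window is never the limit; it also ignores the handshake and the request itself. The function name and the 100 KB file size are illustrative.

```python
import math


def round_trips_to_send(file_bytes: int, initcwnd: int, mss: int = 1460) -> int:
    """Estimate the RTTs needed to deliver a file under idealized slow start."""
    segments_left = math.ceil(file_bytes / mss)
    cwnd = initcwnd  # window size in segments
    rtts = 0
    while segments_left > 0:
        segments_left -= cwnd  # send one full window this round trip
        cwnd *= 2              # slow start doubles the window each RTT
        rtts += 1
    return rtts


FILE_SIZE = 100_000  # hypothetical 100 KB object
for iw in (3, 10, 32):
    print(f"initcwnd={iw:2d}: {round_trips_to_send(FILE_SIZE, iw)} RTTs")
# initcwnd= 3: 5 RTTs
# initcwnd=10: 3 RTTs
# initcwnd=32: 2 RTTs
```

On a path with a 100 ms RTT, moving from 3 to 10 segments would save about 200 ms for this object under these assumptions.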
As most modern web transactions are short-lived and global average network speeds have risen dramatically since this TCP parameter was introduced, RFC 6928 was published, following a research publication by Google in 2010, to increase the default value of initcwnd to 10 segments (a maximum of 14600 bytes).
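The "10 segments, at most 14600 bytes" rule corresponds to the upper bound given in RFC 6928, min(10*MSS, max(2*MSS, 14600)). A small sketch of that formula follows; the function name is our own.

```python
def rfc6928_initial_window(mss: int) -> int:
    """Upper bound on the initial window in bytes, per the RFC 6928 formula."""
    return min(10 * mss, max(2 * mss, 14600))


for mss in (536, 1460, 9000):
    iw = rfc6928_initial_window(mss)
    print(f"MSS={mss}: IW={iw} bytes ({iw // mss} segments)")
# MSS=536: IW=5360 bytes (10 segments)
# MSS=1460: IW=14600 bytes (10 segments)
# MSS=9000: IW=18000 bytes (2 segments)
```

With a typical Ethernet MSS of 1460 bytes the bound is exactly 10 segments, while for very large segments the byte cap takes over and only two segments are allowed.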
Modern Web and Network space
Today, the Internet is dominated by web traffic running on top of short-lived TCP connections. A large proportion of Internet flows complete even before the slow-start phase ends.
The web space keeps growing, and it is growing at an exponential rate. Web objects have become significantly larger. Only a small percentage of web objects, roughly 10% of all Google responses, fit within 4 KB.
Alongside, the network space has also evolved, both in terms of speed and penetration. Ten segments are likely to fit in the queue space available at any broadband access link, even when there is a reasonable number of concurrent sessions.
Google published a paper quantifying the latency benefits and costs of using a larger initcwnd through large-scale experimentation. Following this, the major commercial CDNs increased the initcwnd values on their servers.
Initcwnd settings for major CDNs
Back in 2014, CDNPlanet published a set of results for the initcwnd values of major CDNs. We wanted to see what, if anything, had changed in the intervening three years. Using the scripts that CDNPlanet had open-sourced, we measured the current initcwnd values of some major CDNs. We also observed a couple of interesting behaviors, which we cover in the observations below.
We set up five EC2 instances in five different geographical regions, viz. Oregon, Virginia, Sydney, London and Singapore, to cover most parts of the globe. We used these instances to send requests to different websites using different CDNs. We configured and ran the Go-based initcwnd checker script provided by TurboBytes.
This is the comparative graph that we plotted from the values we collected in our experiments in Jan 2017 and the values that were published by CDNPlanet in Aug 2014. We can see that most of the CDNs have increased their initcwnd values over time. This is likely due to the reasons discussed above. The gathered results are also published.
- Most CDNs have increased their initcwnd value. As average global network bandwidth keeps growing, a larger initcwnd reduces the number of round trips needed, cuts page load times, and improves overall performance.
- We gathered the destination server IP address for every request that we made, and observed that for most CDNs the destination IP remained the same even when we requested the resource from different regions (Oregon, Virginia, Sydney, London and Singapore). This is depicted in the following table.
| CDN | Oregon IP | Sydney IP | London IP | Singapore IP |
|---|---|---|---|---|
| Highwinds | 184.108.40.206 | 220.127.116.11 | 18.104.22.168 | 22.214.171.124 |
| MaxCDN | 126.96.36.199 | 188.8.131.52 | 184.108.40.206 | 220.127.116.11 |
We performed another test to confirm whether these are anycast IP addresses. We ran traceroute to the same IP address from different regions and noted the geographical regions of the intermediate and last hops through which the packets travelled. The intermediate and last hop regions differed depending on the region from which the request originated. Thus, we could conclude that they were anycast IP addresses.
For MaxCDN, the destination IP returned for the requested resource from Virginia, Oregon, Sydney, London and Singapore was 18.104.22.168. The above table depicts the intermediate and last hop regions from multiple locations. From the table we can conclude that the IP being advertised by MaxCDN from different regions is an anycast IP. In fact, most of the major CDNs now use anycast IP addressing in their architecture.
- For two of the CDNs on which we performed the experiments, viz. Akamai and ChinaCache, we observed that the initcwnd values differed between customers. For example, Skyscanner China (tianxun.com), Sotheby’s and EnglishCentral China use ChinaCache as their CDN, and the initcwnd values we obtained were 10, 16 and 20 respectively. Paytm, Flipkart and ESPN use Akamai as their primary CDN, and the values we found were 10, 16 and 32 respectively. So we thought that these CDNs may be customizing initcwnd per customer. But then we made another interesting observation: for Forbes US the initcwnd was 16, and for Forbes India it was 32. So even for the same customer, the initcwnd values differed between regions (which have different domains). Also, for a particular request, these values were consistent regardless of the source location.
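The anycast check from the second observation (same destination IP from every vantage point, but different last-hop regions in traceroute) can be sketched as a simple comparison. The data below is entirely hypothetical and uses a documentation address; it only illustrates the decision rule, not our actual measurements.

```python
def looks_like_anycast(observations: dict) -> bool:
    """Return True if one destination IP is reached via several last-hop regions.

    observations maps a vantage point to (destination_ip, last_hop_region).
    """
    dest_ips = {ip for ip, _ in observations.values()}
    last_hop_regions = {region for _, region in observations.values()}
    return len(dest_ips) == 1 and len(last_hop_regions) > 1


# Hypothetical traceroute summaries from three vantage points.
sample = {
    "Oregon": ("203.0.113.10", "US West"),
    "London": ("203.0.113.10", "Europe"),
    "Singapore": ("203.0.113.10", "Asia"),
}

print(looks_like_anycast(sample))  # True
```

If every vantage point reached the same IP through the same last-hop region, the rule would return False, which would point to a single unicast server instead.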
There are three main inferences that we can draw from these experiments.
- CDNs keep updating their initcwnd values over the years as the network space and web environment evolve.
- Many CDNs have implemented anycast IP routing in their architecture to improve speed, reduce latency and increase the performance of their customers' websites.
- Depending on the customer or geographical region, some CDNs, e.g. Akamai, appear to provide customized initcwnd values.