Anycast is a network service that delivers a datagram to any one server out of a group of servers distributed throughout the network. As originally introduced in RFC 1546, anycast is a way to reach the closest of a set of mirror servers, where distance is measured in network hops. However, the closest mirror server is not always the best mirror server. Some applications are more sensitive to server load, or to the latency or throughput between client and server. In the limit, every application will have its own set of metrics for determining which mirror server a client should contact.
The goal of our research is to design an anycast protocol capable of taking application-specific metrics into account. We call this application-aware anycast to differentiate our research from the IETF's notion of anycast.
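To make the distinction concrete, here is a minimal sketch of how the choice of metric changes which mirror is selected. It is an illustration only, not the protocol itself: the `Mirror` record, the `pick_mirror` helper, the host names, and all of the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Mirror:
    # Hypothetical per-mirror measurements; none of these values are real data.
    name: str
    hops: int          # network hops from the client
    load: float        # fraction of server capacity in use (0.0 to 1.0)
    latency_ms: float  # round-trip time from the client


def pick_mirror(mirrors: Iterable[Mirror], cost: Callable[[Mirror], float]) -> Mirror:
    """Return the mirror with the lowest cost under the given metric."""
    return min(mirrors, key=cost)


MIRRORS = [
    Mirror("mirror-a.example.org", hops=3, load=0.95, latency_ms=40.0),
    Mirror("mirror-b.example.org", hops=7, load=0.10, latency_ms=55.0),
]

# Hop-count selection, in the spirit of RFC 1546, prefers the nearby mirror.
closest = pick_mirror(MIRRORS, lambda m: m.hops)

# A load-sensitive application would prefer the lightly loaded mirror instead.
least_loaded = pick_mirror(MIRRORS, lambda m: m.load)
```

With these made-up numbers the two metrics disagree: the hop-count choice is the heavily loaded mirror, while the load-based choice is the more distant one.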
Recently, we conducted a measurement study to gain insight into the features of mirrored sites that will help in designing an application-layer anycast protocol. Our goal was to answer two questions:
To answer these questions, we used 9 clients to fetch data from 3 mirrored web sites composed of 47 web servers. Over the 3 weeks of the study, almost half a million fetches were performed. Our analysis of the fetches has produced three main findings:
Yes, you too can now play with the data files we have collected. The format of each data file is:
time_of_day URL transfer_time bytes_transferred

The time of day is from the standard Unix clock (in seconds), and probably in UTC as well (perl's localtime() function translates this into a regular date quite well). The URL should be self-explanatory. The transfer time is in seconds. Fetches that failed (either because they timed out or because lynx exited with an error) will have a transfer time of -1. The last field is the number of bytes returned by lynx.
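For readers who would rather not use perl, the following is a minimal sketch of a reader for this format. It assumes the four fields are separated by whitespace, that URLs contain no spaces, and that the byte count is an integer. The names `Fetch`, `parse_fetches`, and `mean_transfer_time` are our own, and the sample lines are made up for illustration rather than taken from the data files.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional


@dataclass
class Fetch:
    time_of_day: float      # Unix clock, in seconds
    url: str
    transfer_time: float    # seconds; -1 marks a failed fetch
    bytes_transferred: int  # bytes returned by lynx

    @property
    def failed(self) -> bool:
        # Timeouts and lynx errors are recorded with a transfer time of -1.
        return self.transfer_time < 0

    @property
    def when(self) -> datetime:
        # Interprets the timestamp as seconds since the Unix epoch, in UTC.
        return datetime.fromtimestamp(self.time_of_day, tz=timezone.utc)


def parse_fetches(lines: Iterable[str]) -> List[Fetch]:
    """Parse data-file lines, skipping any that do not have four fields."""
    fetches = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue
        t, url, secs, nbytes = fields
        fetches.append(Fetch(float(t), url, float(secs), int(nbytes)))
    return fetches


def mean_transfer_time(fetches: Iterable[Fetch]) -> Optional[float]:
    """Mean transfer time over successful fetches, or None if there are none."""
    ok = [f.transfer_time for f in fetches if not f.failed]
    return sum(ok) / len(ok) if ok else None


# Made-up sample lines in the documented format (not real measurements).
SAMPLE = [
    "1000000000 http://mirror.example.org/index.html 1.25 10240",
    "",
    "1000000060 http://mirror.example.org/index.html -1 0",
]

fetches = parse_fetches(SAMPLE)
```

Failed fetches are kept in the parsed list rather than dropped, so failure rates can be computed alongside transfer times; `mean_transfer_time` excludes them so that the -1 sentinel does not skew the average.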
The data files are here (listed by client site):