DOT: Data-Oriented Transfer
DOT is a transfer service that handles the heavy lifting of data
transfer on behalf of client applications such as HTTP, SMTP, or
custom applications. DOT centralizes the functions of data transfer so that
innovations in transfer techniques can apply both to newly developed
applications and to legacy applications.
As a core transfer mechanism within DOT, we have created
Similarity-Enhanced Transfer (SET). SET is a multi-source download system that can
retrieve pieces of a target file both from identical sources
(e.g., mirrors of the file)
and from similar sources (nonidentical files that share bytes with the target
file), thereby outperforming contemporary approaches such as BitTorrent. A key
advantage of SET is that it detects such similar and identical sources by
performing a constant number of lookups and by inserting a constant number of mappings per object into a global lookup table.
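The constant-cost lookup idea can be illustrated with a small sketch. This is not the SET implementation: it assumes fixed-size chunks, SHA-1 chunk hashes, an in-memory dictionary standing in for the global lookup table, and a "handprint" made of the K smallest chunk hashes of a file. The names CHUNK_SIZE, HANDPRINT_SIZE, handprint, and LookupTable are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import defaultdict

# Assumptions for illustration only: tiny fixed-size chunks and a small K.
CHUNK_SIZE = 4
HANDPRINT_SIZE = 3


def chunk_hashes(data: bytes) -> set:
    """Hash every fixed-size chunk of the data and return the distinct hashes."""
    return {
        hashlib.sha1(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    }


def handprint(data: bytes, k: int = HANDPRINT_SIZE) -> list:
    """Return the k smallest chunk hashes: a fixed-size sample of the file."""
    return sorted(chunk_hashes(data))[:k]


class LookupTable:
    """In-memory stand-in for a global lookup table (hypothetical)."""

    def __init__(self):
        self._table = defaultdict(set)

    def publish(self, name: str, data: bytes) -> int:
        """Insert at most k mappings for this object, whatever its size."""
        hashes = handprint(data)
        for h in hashes:
            self._table[h].add(name)
        return len(hashes)

    def find_sources(self, data: bytes) -> set:
        """Perform at most k lookups and return every object sharing a hash."""
        sources = set()
        for h in handprint(data):
            sources |= self._table.get(h, set())
        return sources


target = b"AAAABBBBCCCCDDDD"
table = LookupTable()
table.publish("mirror", b"AAAABBBBCCCCDDDD")     # identical to the target
table.publish("similar", b"AAAABBBBCCCCXXXX")    # shares three of four chunks
table.publish("unrelated", b"WWWWYYYYZZZZQQQQ")  # shares no chunks
found = table.find_sources(target)
print(sorted(found))
```

In this toy run the identical mirror and the similar file are both returned and the unrelated file is not, while each object costs at most HANDPRINT_SIZE insertions and each query at most HANDPRINT_SIZE lookups. Because both files are sampled by the same deterministic rule (smallest hashes first), files that share many chunks are likely to share handprint entries; files that share only a few chunks may be missed.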
Papers and Publications using DOT itself
- An Architecture for Internet Data Transfer. Niraj Tolia, Michael Kaminsky, David G. Andersen, and Swapnil Patil. In Proc. NSDI, May 2006, San Jose, CA.
(Slides - PPT)
(This version contains two corrections to the originally published version: the citation for DOA by Walfish et al. was corrected to include the full author list, and a missing "the" was inserted in Section 6.3.2. -- Apr 16, 2006.)
- Exploiting Similarity for Multi-Source Downloads using File Handprints. Himabindu Pucha, David G. Andersen, Michael Kaminsky. In Proc. NSDI, April 2007, Cambridge, MA.
- Adaptive File Transfers for Diverse Environments. Himabindu Pucha, Michael Kaminsky, David G. Andersen, and Michael A. Kozuch. In Proc. USENIX Annual Technical Conference, June 2008, Boston, MA.
- Ditto - A System for Opportunistic Caching in Multi-hop Wireless Mesh Networks. Fahad Dogar, Amar Phanishayee, Himabindu Pucha, Olatunji Ruwase, and David Andersen. In Proc. ACM Mobicom, September 2008, San Francisco, CA.
- Efficient Similarity Estimation for Systems Exploiting Data Redundancy. Kanat Tangwongsan, Himabindu Pucha, David G. Andersen, Michael Kaminsky. In Proc. IEEE Infocom, March 2010, San Diego, CA.
Related Publications in Data Transfer
- Efficiency through Eavesdropping: Link-layer Packet Caching. Mikhail Afanasyev, David G. Andersen, and Alex C. Snoeren. In Proc. 5th USENIX NSDI, April 2008, San Francisco, CA.
- On Application-level Approaches to Avoiding TCP Throughput Collapse in Cluster-Based Storage Systems. Elie Krevat, Vijay Vasudevan, Amar Phanishayee, David G. Andersen, Gregory R. Ganger, Garth A. Gibson, and Srinivasan Seshan. In Proc. Petascale Data Storage Workshop at Supercomputing'07, Nov. 2007, Reno, NV.
- Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems. Amar Phanishayee, Elie Krevat, Vijay Vasudevan, David G. Andersen, Gregory R. Ganger, Garth A. Gibson, and Srinivasan Seshan. In Proc. USENIX Conference on File and Storage Technologies, Feb. 2008, San Jose, CA.
Software
People
Support
DOT is supported by a grant from the Carnegie Mellon
CyLab,
by Intel Research,
and by CAREER award CNS-0546551 to David Andersen from the
National Science Foundation.