Exploiting the Non-Determinism and Asynchrony of Set Iterators to Reduce Aggregate File I/O Latency

David C. Steere

 
 

Abstract

A key goal of distributed systems is to provide prompt access to shared information repositories. The high latency of remote access is a serious impediment to this goal. This paper describes a new file system abstraction called dynamic sets - unordered collections created by an application to hold the files it intends to process. Applications that iterate on the set to access its members allow the system to reduce the aggregate I/O latency by exploiting the non-determinism and asychrony inherent in the semantics of set iterators. This reduction in latency comes without relying on reference locality, without modifying DFS servers and protocols, and without unduly complicating the programming model. This paper presents this abstraction and describes an implementation of it that runs on local and distributed file systems, as well as the World Wide Web. Dynamic sets demonstrate substantial performance gains - up to 50% savings in runtime for search on NFS, and up to 90% reduction in I/O latency for Web searches.