This dissertation shows that it is practical to support disconnected operation for a fundamental system service: general purpose file management. It describes the architecture, implementation, and evaluation of disconnected file service in the Coda file system. The architecture is centered on the idea that the disconnected service agent should be one and the same as the client cache manager. The Coda cache manager prepares for disconnection by pre-fetching and hoarding copies of critical files; while disconnected it logs all update activity and otherwise emulates server behavior; upon reconnection it reintegrates by sending its log to the server for replay. This design achieves the goal of high data availability (users can access many of their files while disconnected) without sacrificing the other positive properties of contemporary distributed file systems: scalability, performance, security, and transparency.
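The three phases named above (hoarding, emulation, and reintegration) can be pictured as a small state machine inside the cache manager. The following is only a minimal sketch under stated assumptions, not Coda's actual implementation: the class names `Server` and `CacheManager`, the in-memory dictionaries, and the whole-file log records are all hypothetical, and replay here simply overwrites server state without the conflict detection a real reintegration would need.

```python
from enum import Enum, auto


class State(Enum):
    HOARDING = auto()       # connected: cache is kept stocked with critical files
    EMULATING = auto()      # disconnected: client stands in for the server
    REINTEGRATING = auto()  # reconnected: logged updates are being replayed


class Server:
    """Hypothetical stand-in for a file server (an in-memory dict)."""

    def __init__(self):
        self.files = {}

    def fetch(self, path):
        return self.files[path]

    def store(self, path, data):
        self.files[path] = data

    def replay(self, log):
        # Assumption: no conflict detection; each logged update is applied in order.
        for path, data in log:
            self.files[path] = data


class CacheManager:
    """Hypothetical client cache manager that doubles as the disconnected agent."""

    def __init__(self, server, hoard_list):
        self.server = server
        self.hoard_list = list(hoard_list)  # paths the user considers critical
        self.cache = {}
        self.log = []                       # updates made while disconnected
        self.state = State.HOARDING

    def hoard(self):
        # Pre-fetch every critical file that exists on the server.
        for path in self.hoard_list:
            if path in self.server.files:
                self.cache[path] = self.server.fetch(path)

    def disconnect(self):
        self.state = State.EMULATING

    def read(self, path):
        if self.state is State.EMULATING:
            # A KeyError here corresponds to a disconnected cache miss.
            return self.cache[path]
        data = self.server.fetch(path)
        self.cache[path] = data
        return data

    def write(self, path, data):
        self.cache[path] = data
        if self.state is State.EMULATING:
            self.log.append((path, data))   # log the update for later replay
        else:
            self.server.store(path, data)   # connected: write through

    def reconnect(self):
        self.state = State.REINTEGRATING
        self.server.replay(self.log)        # send the log to the server
        self.log.clear()
        self.state = State.HOARDING
```

In this sketch a write made while disconnected is visible locally at once but reaches the server only when `reconnect()` replays the log, which mirrors the ordering of events described in the paragraph above.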
The system has been in serious use by more than twenty people over the course of two years. Both stationary and mobile workstations have been employed as clients, and disconnections have ranged up to about ten days in length. Usage experience has been extremely positive. The hoarding strategy has sufficed to avoid most disconnected cache misses, and partitioned data sharing has been rare enough to cause very few reintegration failures. Measurements and simulation results indicate that disconnected operation in Coda should be equally transparent and successful at much larger scale.
The main contributions of the thesis work and this dissertation are the following: a new, client-based approach to data availability that exploits existing system structure and has special significance for mobile computers; an implementation of the approach that is robust enough to have been put to real use; and an analysis that sheds further light on the scope and applicability of the approach.