Currently, Blake Jones (blakej@foo.caltech.edu) and John Langford (jl@crush.caltech.edu) are interested in working on it. Email either of us if you are interested in working on it too.

The basic idea is to provide distributed files in a manner superior to NFS. Improvements over NFS:
1. Local caching of files.
2. Strong security.
3. More customizability.

The first improvement, local caching of files, could have a large impact in several ways. Pulling binaries down over NFS is generally a pretty poor option because, as soon as the program is done executing, no local copy remains. This means it will need to be pulled across the net again at the next execution. One could imagine an operating system that consisted of local configuration files plus a caching NFS for all system programs. The disk space required for caching should be somewhat lower than that required for a full OS distribution. In addition, the user would have access to the full distribution, even if the full distribution exceeded the capacity of the hard drive. There are also several settings you could imagine for cache updates. Options include:
1. rabid updating: as soon as the file changes on the server, attempt to update the cache on all machines which currently have the file.
2. at use: as soon as the file is used by the local OS, if it is cached, check to see if it has changed.
3. second use: whenever a file is used, spawn a second process that checks to see if the file has changed, and update it after this use has finished.
4. decaying time updates.
5. no updating.
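
To make the update options above a bit more concrete, here is a rough Python sketch of how a client might act on each one. None of this is a design we have settled on. The names (Server, CacheEntry, UpdatePolicy, use, refresh, fetch) are made up for illustration, the server is just an in-memory dictionary, and "decaying time updates" is interpreted here as "check at intervals that double while the file stays unchanged", which is only one possible reading.

```python
from dataclasses import dataclass
from enum import Enum


class UpdatePolicy(Enum):
    PUSH = 1        # option 1: the server pushes changes to every cache
    AT_USE = 2      # option 2: check the server before each use
    SECOND_USE = 3  # option 3: serve the cached copy, refresh afterwards
    DECAYING = 4    # option 4: check at growing intervals (assumed meaning)
    NEVER = 5       # option 5: never update


class Server:
    """Hypothetical stand-in for the file server: path -> (version, data)."""

    def __init__(self):
        self.files = {}

    def write(self, path, data):
        version = self.files.get(path, (0, b""))[0] + 1
        self.files[path] = (version, data)

    def read(self, path):
        return self.files[path]


@dataclass
class CacheEntry:
    data: bytes
    version: int
    policy: UpdatePolicy
    next_check: float = 0.0  # only used by DECAYING
    interval: float = 1.0    # seconds, only used by DECAYING


def fetch(path, server, policy):
    """First access: pull the file across the net and remember it."""
    version, data = server.read(path)
    return CacheEntry(data=data, version=version, policy=policy)


def refresh(entry, path, server):
    """Bring the entry up to date; return True if the file had changed."""
    version, data = server.read(path)
    if version == entry.version:
        return False
    entry.version, entry.data = version, data
    return True


def use(entry, path, server, now=0.0):
    """Return the bytes that this use of the file should see."""
    if entry.policy is UpdatePolicy.AT_USE:
        refresh(entry, path, server)
        return entry.data
    if entry.policy is UpdatePolicy.SECOND_USE:
        data = entry.data             # this use sees the old copy
        refresh(entry, path, server)  # stands in for the spawned second process
        return data
    if entry.policy is UpdatePolicy.DECAYING:
        if now >= entry.next_check:
            changed = refresh(entry, path, server)
            # Back off while the file is stable, reset when it changes.
            entry.interval = 1.0 if changed else entry.interval * 2
            entry.next_check = now + entry.interval
        return entry.data
    # PUSH and NEVER do no work at use time. For PUSH, the server side
    # would call refresh() on each cached entry when the file changes.
    return entry.data
```

With this sketch, an AT_USE entry sees a server change on its very next use, a SECOND_USE entry sees it one use later, and a NEVER entry keeps its original copy.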

Strong security is necessary in light of the potential use as a core piece of an operating system. It would be nice to find a good way to include crypto.

Customizability would also be quite useful. Currently, you can NFS export directories and all of their contents. Under this local caching scheme, there is no reason not to allow full local versions of the files. The scenario would be that the server would have a base configuration. Then, the clients could alter their local version of a file, setting it to never be updated. This would allow local system configuration.
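
A minimal sketch of that scenario, in Python, might look like the following. The Client class and its method names are hypothetical, and a plain dictionary stands in for the server's exported base configuration. The only point is the lookup order: a locally altered file wins over the cache and is skipped when the cache is brought up to date.

```python
class Client:
    """Hypothetical caching client with never-updated local versions."""

    def __init__(self, server_files):
        self.server_files = server_files  # dict standing in for the server export
        self.cache = {}                   # ordinary cached copies
        self.pinned = {}                  # local versions, set to never update

    def read(self, path):
        if path in self.pinned:           # local configuration wins
            return self.pinned[path]
        if path not in self.cache:        # first use: pull across the net
            self.cache[path] = self.server_files[path]
        return self.cache[path]

    def write_local(self, path, data):
        """Alter the local version of a file; it will never be updated."""
        self.pinned[path] = data

    def sync(self):
        """Bring ordinary cached copies up to date; pinned files are untouched."""
        for path in list(self.cache):
            self.cache[path] = self.server_files[path]
```

For example, a client that calls write_local("/etc/hosts", ...) keeps its own copy even after the server's base version changes and sync() runs, while a client that never altered the file picks up the new base version.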

Furthermore, there is no reason why this idea isn't scalable. A client could be a server to other clients, although dependency loops should be avoided.
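
One way dependency loops could be avoided is to check, before a client starts fetching from a new server, whether that server already depends (directly or through other hosts) on the client. The Python sketch below assumes we somehow know which hosts each host fetches from. The upstreams mapping and the function name are invented for illustration, and how this information would actually be gathered is an open question.

```python
def would_create_loop(upstreams, client, server):
    """Return True if letting `client` fetch from `server` would form a loop.

    `upstreams` maps each host to the set of hosts it already fetches from.
    """
    stack, seen = [server], set()
    while stack:
        host = stack.pop()
        if host == client:   # the server (transitively) depends on the client
            return True
        if host in seen:
            continue
        seen.add(host)
        stack.extend(upstreams.get(host, ()))
    return False
```

If b fetches from a and c fetches from b, then a must not fetch from c, while an unrelated host d may do so freely.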

You can imagine a scheme for memory:
1 registers
2 L1 cache
3 L2 cache
4 RAM
5 Virtual RAM (swapping)
6 Hard disk
7 Net program servers (practical with a caching NFS system)

The actual amount of functionality we implement depends heavily on what tools we can find and what help we can get. We will take maximum advantage of existing software in implementing this. Blake has come up with a way to implement (at least a large portion of) the functionality working with existing NFS servers under Linux.

Does it seem like a good project? Do you have any knowledge of what is doable? What to do and not do? I'd like to put a fairly high priority on producing something that will be useful at large outside of class. Linux/GNU seems like the operating system of choice to work with because of the availability of code.

