Store And Forward Communication: UUCP and FidoNet

In the 1980s, the Internet / ARPANET remained primarily a research network connecting universities and large computer-centric companies. Much consumer networking activity focused instead on dial-up BBSs (Bulletin Board Systems), which users accessed by dialing in with a modem, checking email or reading message boards, and then disconnecting.

BBSs were originally stand-alone communities, but as their popularity spread, BBS operators began to seek ways of sharing email and discussion groups between systems. While several proprietary sharing methods sprang up, FidoNet emerged as the most popular global network for sharing files, email, and news. At its peak, it had around 30,000 participating systems.

FidoNet worked by having nodes dial in to each other periodically to send and receive messages. These messages were relayed in a store-and-forward manner through FidoNet until they reached their destination. Connections were made only once or twice per day to amortize the cost of long-distance telephone calls, which were comparatively expensive: about $0.25 per minute in 2005 dollars, compared to today's discount rates of $0.05 per minute or less. Many BBSs operated with a single telephone line, meaning that users could not dial in while the BBS was connected to the rest of the network.

Somewhat earlier than the development of FidoNet, the UUCP ("Unix-to-Unix Copy Protocol") system was developed by Mike Lesk at AT&T. UUCP is another store-and-forward protocol for distributing files, commands, email, and news; it can run over dial-up modem connections, hard-wired connections, or the Internet. One of the major features of the initial UUCP system was its use of explicit routing, known as UUCP "bang" (!) paths. Email addresses looked something like duke!research!ucbvax!user@mit-ai (meaning: send the mail first to duke, then to the machine named "research", then to the VAX at Berkeley, then via the ARPANET to user at MIT's AI lab).
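To make the hop-by-hop reading of a bang path concrete, here is a minimal sketch in Python. The function names are hypothetical, and the sketch assumes the simplest interpretation: everything before the last "!" is an ordered list of relay hosts, and whatever follows is handed to the final relay untouched. Real mailers disagreed about how to parse addresses that mix "!" and "@", so this is one reading, not the rule.

```python
from typing import List, Optional, Tuple


def split_bang_path(address: str) -> Tuple[List[str], str]:
    """Split a bang path into (relay hosts in order, final mailbox)."""
    *hops, mailbox = address.split("!")
    if not mailbox or any(not hop for hop in hops):
        raise ValueError("malformed bang path: %r" % address)
    return hops, mailbox


def next_hop(address: str) -> Tuple[Optional[str], str]:
    """Return (neighbor to hand the message to, address to pass along).

    Each relay strips its own successor from the front of the path, so the
    address shrinks by one host at every hop.
    """
    hops, mailbox = split_bang_path(address)
    if not hops:
        return None, mailbox  # no relays left: deliver locally
    return hops[0], "!".join(hops[1:] + [mailbox])


hops, mailbox = split_bang_path("duke!research!ucbvax!user@mit-ai")
print(hops)     # ['duke', 'research', 'ucbvax']
print(mailbox)  # user@mit-ai
print(next_hop("duke!research!ucbvax!user@mit-ai"))
# ('duke', 'research!ucbvax!user@mit-ai')
```

Note that the sender, not the network, chooses every relay; no host along the way needs to know anything beyond its immediate neighbors.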

The advantage of these store-and-forward networks was their resilience to extreme disconnection. In both FidoNet and UUCP, messages and files could progress through the system in the complete absence of a working end-to-end path between the sender's node and the receiver's. This is in stark contrast to the way most Internet mail systems are configured today, in which mail is delivered over a TCP connection directly from the sender's mail node to the receiver's mail node. There were many downsides, however: explicit addressing in UUCP scaled poorly (see the pathalias paper) and was quite user-unfriendly. FidoNet nodes needed a copy of the entire node list to know how to reach each other. Imagine trying to scale that to the millions of mail-sending hosts that exist today!
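The claim that a message can make progress without any end-to-end path can be illustrated with a toy simulation. Everything here is an assumption made for illustration: time is divided into discrete ticks, each link is "up" only during the ticks listed for it, and a message moves at most one hop per tick.

```python
def deliver(path, link_up, max_ticks):
    """Return the tick at which a message reaches the end of `path`, or None.

    path:      list of node names, sender first, recipient last
    link_up:   dict mapping (a, b) to the set of ticks when a can call b
    max_ticks: number of ticks to simulate
    """
    if len(path) < 2:
        return 0  # sender and recipient are the same node
    holder = 0  # index of the node currently storing the message
    for tick in range(max_ticks):
        link = (path[holder], path[holder + 1])
        if tick in link_up.get(link, set()):
            holder += 1  # the call succeeded; the next node now stores it
            if holder == len(path) - 1:
                return tick
    return None


def end_to_end_ticks(path, link_up):
    """Ticks at which every link on the path is up simultaneously."""
    sets = [link_up.get(link, set()) for link in zip(path, path[1:])]
    return set.intersection(*sets) if sets else set()


path = ["a", "b", "c"]
link_up = {("a", "b"): {1}, ("b", "c"): {5}}
print(end_to_end_ticks(path, link_up))  # set(): never a full path
print(deliver(path, link_up, 10))       # 5: delivered anyway
```

A direct-connection system would need a tick at which both links are up; here there is none, yet the message still arrives because node b holds it between calls.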

There are few papers that provide overviews of this style of communication and these systems, so the readings for this lecture consist of several Web pointers to informal specs and articles.

Layering

Both UUCP and FidoNet suffer from a somewhat murky layering model and the absence of a "narrow waist" like IP as a way to foster extensibility. UUCP defines a number of underlying protocols for transferring data; these protocols provide things like packet-based, flow-controlled, error-corrected channels (e.g., the UUCP 'g' protocol). Various protocols are designed to handle half- and full-duplex links with different capabilities for handling eight-bit-clean data, flow control, and so on. In general, this protocol arrangement is messy, and a number of the protocols appear useless.

FidoNet defines, in addition to addressing and routing, an application-layer protocol for mail addressing. FidoNet most commonly used the XMODEM protocol for accomplishing data transfers.

Naming

FidoNet

FidoNet uses a geographically hierarchical addressing scheme. Addresses consist of a Zone, Net, Node, and Point. The zone roughly corresponds to a continent, with Zone 1 representing the US, Canada, and the Caribbean. A net is a geographical region within a zone. A node is a system that can receive phone calls. A point is a "subnode" that can be reached only through a particular node.

A FidoNet address is written zone:net/node.point. For example, 1:512/666.0 is zone 1, net 512, node 666, point 0.
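The address format can be captured with a small parser. This is a sketch, not part of any FidoNet software: the class and function names are hypothetical, and treating a missing ".point" suffix as point 0 is an assumption made here for convenience.

```python
import re
from typing import NamedTuple


class FidoAddress(NamedTuple):
    zone: int
    net: int
    node: int
    point: int = 0

    def __str__(self):
        return "%d:%d/%d.%d" % (self.zone, self.net, self.node, self.point)


# zone:net/node with an optional .point suffix
_ADDRESS = re.compile(r"^(\d+):(\d+)/(\d+)(?:\.(\d+))?$")


def parse_fido_address(text):
    """Parse 'zone:net/node[.point]' into a FidoAddress."""
    match = _ADDRESS.match(text.strip())
    if match is None:
        raise ValueError("not a FidoNet address: %r" % text)
    zone, net, node, point = match.groups()
    return FidoAddress(int(zone), int(net), int(node), int(point or 0))


addr = parse_fido_address("1:512/666.0")
print(addr.zone, addr.net, addr.node, addr.point)  # 1 512 666 0
print(parse_fido_address("1:170/918.42"))          # 1:170/918.42
```

Because the fields are ordered from most to least general, comparing two addresses field by field shows at which level of the hierarchy they diverge.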

UUCP

Each host in UUCP has a globally unique hostname. The original UUCP network consisted of hosts like ucbvax and mit-ai.

Routing

FidoNet

FidoNet's routing was also based upon geographical proximity, with the primary goal of reducing long-distance charges. The standard form of routing was to go from the node, to a "hub" that connected several nodes, to a region coordinator, to a zone coordinator, and then back down the appropriate hierarchy to reach the recipient. The goal behind this routing was to achieve maximum batching of messages for better compression and more efficient use of telephone time.

Each FidoNet system had to maintain a nodelist that identified every other member system. The nodelist was maintained in a similarly hierarchical manner, with hubs maintaining local membership information, and so on.

UUCP

Routing in UUCP is performed via a form of source routing, leading to the UUCP bang-path: "foo!bar!ucbvax!user". Over time, as the network grew more complex, this addressing became extremely inconvenient for users. Note, however, that in some views of the world, UUCP hostnames are absolute, not relative: foo!bar!ucbvax!user specifies the exact same user as ucbvax!user. Thus, it is possible to separate the human-visible identifier from the path computation. The drawback of absolute names is clear: keeping hostnames globally unique requires global coordination before one can connect to the UUCP network. The absolute vs. relative debate went on for some number of years, before UUCP addressing was obliterated by Internet-style addressing.

Like many forms of pure source routing, the use of pathalias required that all hosts have an up-to-date map of the network, or at least as much of the network as they wanted to talk to.
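The idea behind computing routes from a network map can be sketched with a shortest-path search. This is not pathalias's actual input or output format; the map structure, the link costs, and the function names below are all hypothetical, and Dijkstra's algorithm stands in for whatever heuristics the real tool applied.

```python
import heapq


def bang_routes(links, source):
    """Compute the cheapest relay list from `source` to every reachable host.

    links:  dict mapping host -> {neighbor: cost of calling that neighbor}
    source: the local host
    Returns a dict mapping host -> list of hosts to traverse, in order.
    """
    routes = {}
    heap = [(0, source, [])]
    while heap:
        cost, host, hops = heapq.heappop(heap)
        if host in routes:
            continue  # already settled via a cheaper route
        routes[host] = hops
        for neighbor, link_cost in links.get(host, {}).items():
            if neighbor not in routes:
                heapq.heappush(heap, (cost + link_cost, neighbor, hops + [neighbor]))
    return routes


def bang_path(routes, host, user):
    """Expand the absolute name host!user into a full bang path."""
    return "!".join(routes[host] + [user])


links = {
    "home": {"duke": 1, "research": 5},
    "duke": {"research": 1},
    "research": {"ucbvax": 1},
}
routes = bang_routes(links, "home")
print(bang_path(routes, "ucbvax", "user"))  # duke!research!ucbvax!user
```

The user types only ucbvax!user; the mail program fills in the relays. The price is that the `links` map must be kept current on every host that wants to compute routes.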

Scheduling

FidoNet handles scheduling by requiring nodes to be available for Fido transfers during "Zone Mail Hour." Most systems would accept mail at other times, but the reserved time ensured that regardless of user activity, the nodes would be able to transmit and accept network traffic.

The UUCP network specified scheduling only on a per-pair basis. Individual node configurations could specify when to try calling a particular neighbor, but there was no global coordination. In general, UUCP was used on more richly connected hosts.

Building Higher Level Services

UUCP is a generic file copy and command execution protocol. To execute commands, it copies across an 'X.*' file, which is a text file that tells the remote system what to execute. The 'X' file may reference another file that was also copied to the remote system. Mail, for instance, is handled by issuing a remote execution for the rmail command.
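As a rough illustration, the following Python sketch builds the text of such an execute file. The single-letter line types used here (U for the requesting user and system, F for a required file, I for standard input, C for the command) follow the execute-file format as documented for Taylor UUCP; treat them as an assumption if you are targeting another implementation. The function name, user, system, and data-file name are hypothetical.

```python
def build_execute_file(user, system, data_file, command):
    """Return the text of a minimal X.* file requesting a remote command."""
    lines = [
        "U %s %s" % (user, system),  # who asked, and from which system
        "F %s" % data_file,          # file that must arrive before execution
        "I %s" % data_file,          # use that file as standard input
        "C %s" % command,            # command line to run on the remote side
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: deliver a queued message to bob via rmail.
print(build_execute_file("alice", "stub", "D.stub0001", "rmail bob"), end="")
```

The message body travels as an ordinary copied file; the X file only tells the remote system which command should consume it.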

"Modern" UUCP

UUCP is still used, though less and less often, in today's Internet. Its primary utility is to permit a domain that is frequently offline to handle its own email. In this model, UUCP is mostly used to connect a single stub network back to a well-connected core.

{Internet} ---- {Gateway} -| uucp |--- stub

In this configuration, the domain example.com would have its DNS hosted by its gateway ISP. The gateway would publish an MX record for example.com pointing to the gateway's own mail server. The mail server would be configured to relay mail to example.com via UUCP. Example's UUCP mail server would log in to the gateway periodically, download all of its queued email, and upload any messages it needed to send.
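The gateway's role in this arrangement amounts to a queue that is drained whenever the stub calls in. The sketch below models only that behavior; the class and method names are hypothetical, and no real UUCP or SMTP traffic is involved.

```python
from collections import deque


class Gateway:
    """Holds mail for an often-offline stub domain until the stub polls."""

    def __init__(self):
        self.queue_for_stub = deque()  # mail received via the MX record
        self.sent_to_internet = []     # mail relayed on the stub's behalf

    def accept_from_internet(self, message):
        """Mail for the stub arrives while the stub may be offline."""
        self.queue_for_stub.append(message)

    def poll(self, outbound):
        """One session initiated by the stub: exchange mail in both directions."""
        self.sent_to_internet.extend(outbound)
        inbound = list(self.queue_for_stub)
        self.queue_for_stub.clear()
        return inbound


gateway = Gateway()
gateway.accept_from_internet("to: bob@example.com ...")
gateway.accept_from_internet("to: carol@example.com ...")

# Later, the stub dials in, uploads one message, and collects its mail.
received = gateway.poll(["to: dave@elsewhere.example ..."])
print(len(received))             # 2
print(gateway.sent_to_internet)  # ['to: dave@elsewhere.example ...']
```

The stub always initiates the session, so the gateway never needs to know when, or at what address, the stub will next be reachable.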

This functionality is available in a somewhat less robust fashion through the (relatively) new SMTP ETRN command, supported by sendmail, which asks a mail server to begin sending its queued mail.

Addressing, Naming, and Routing

UUCP-style email addresses originally combined addressing, naming, and routing all in one, leading to unfortunate email addresses that were both complex (hosta!hostb!hostc!user) and sender-relative (e.g., another person might see that address as hostx!hosty!hostc!user).

The emergence of pathalias reduced this constraint, transforming UUCP email addresses into a combination of user and host address (hostc!user), leaving it to the mail program to compute the route to the destination. FidoNet addresses also fall into this category, e.g., a user at host 1:170/918.42.

Internet addresses, in contrast, add another layer of indirection, sending mail to user@domain. In this case, domain is a name, not an address -- it has no topological significance, but is human-friendly. Through a lookup in the DNS, the domain is resolved into an address, and the mail is then sent directly to that address. Meanwhile, the routing computations happen at a layer far underneath the mail transfer.

In general, many of the "ugly" features of UUCP and FidoNet have little to do with their operation as store-and-forward networks, but more to do with various architectural issues (e.g., bang paths) that were eventually weeded out of their Internet cousins. Some of these cleaner abstractions, such as a better separation of naming, addressing, and routing, are facilitated by the notion of a constantly well-connected core, but by no means depend on it. For instance, an interesting exercise would be to devise a store-and-forward messaging system that runs on top of Internet protocols, does not assume constant connectivity, but still permits indirect, human-friendly naming that abstracts identity from topology.