15-441 Project 2: Network Layer

Assigned: Monday, Feb 13th, 2006

~~Due: Monday, March 6th, 2006, 11:59pm~~

Due: Thursday, March 9th, 2006, 4:59pm

Lead TAs: Thomas Harris and Mike Cui

Introduction - Logistics - Assignment - Evaluation
Hand-in - Coding Style - Resources

Introduction

While Project 1 gave you experience with the application layer in networking, this project will give you experience with one of the lower layers of the networking stack. Specifically, you will design and implement forwarding and routing in an IP-capable network layer.

Note that Project 3 depends significantly on Project 2. To successfully complete Project 3, you will need the facilities that you implement in Project 2 (forwarding). It is vital that you keep this in mind when choosing your project partner, and while working on the project.

Logistics

This is a group project. You must find exactly one partner for this assignment. The only reason you should not have a partner is if there are an odd number of people in the class and you are left out (in which case contact us). Talk to your neighbors and use the bboards. See Barbara Grandillo, WeH 8018, for access to the sign-up sheet. The deadline for signing up is Wednesday 3/15.

Like Project 1, you must write your code in C. Your code will run on top of a virtual machine that we have supplied. See the Support Code section below.

The project directory for this project is /afs/cs/academic/class/15441-s06/project2, which is referred to as $PDIR in the rest of this document.

Assignment

Support Code

As you know, a network layer is not useful in isolation. At minimum, we need a link layer below the network layer, and transport layer above the network layer. We also need facilities for configuring the network.

In the "real-world", the lower layers (link and physical) would be provided by hardware. To simplify your task in this project, however, we provide a machine simulator which implements these layers and the transport layer as part of a Unix process. To further simplify your task, we also provide a socket library. The socket library, as you will recall from Project 1, is the interface between network applications and the operating system’s networking code.

The relationship between these components is illustrated in the figure above. The TFTP server and TFTP client are included as examples of network applications. The gray box in the machine simulator is the component which you must implement. The white boxes of the simulator are components that we provide.

Details about the machine simulator, including information about how to compile your components with the components that we provide, are given in the simulator handout.

The simulator libraries and header files are under $PDIR/lib and $PDIR/include respectively. For your convenience, we provide you a template for the code that you can start with. This includes Makefile, skeleton definitions of some important structures, skeleton prototypes of some interface functions, etc. All the template files are in $PDIR/template/kernel. Copy the template directory into your working space and get started with programming. $PDIR/utils contains utilities to help you in debugging your code. Read the README file under the directory to learn how to use those utilities.
IP Forwarding

Destination Interface

1.1.1.2 1

1.1.2.2 2

Table 1: Forwarding table at Node 1

Destination Interface

1.1.1.1 1

1.1.2.2 1

Table 1: Forwarding table at Node 2

In the first part of the project, you will implement basic network layer functionalities.The network layer for this project will be modeled after IPv4. The network layer provides an addressing mechanism, and the ability to forward packets between nodes. Your network layer must provide both of these facilities. For addressing, we require that your network layer use IPv4 addresses.

Section 4.1.4 of the text book describes datagram forwarding in IP. The forwarding table entries consist of (destination IP, interface) pair. Note that this is different from the forwarding table entries described in the text book. The prototype for forwarding table entries is given in $PDIR/template/kernel/rtable.h. In the example network given in the simulator hand-out, the forwarding tables at the nodes 1, 2 and 3 are given in Table 1, Table 2 and Table 3 respectively.

Netmasks will be assigned to interfaces.
- For the don't route case, you will not be required to implement greatest-prefix-first routing to get full credit, but you will have to match the target ip address with the network (ip addr + netmask) of the interface. Note that this simplification requires that we won't specify interface networks that overlap.
- For the route case, the routing table won't include netmasks; each address either will or will not be in the table, and if it is in the table it will have a specific interface target. Note that this is a simplification necessitates a route for every host in every router, which is obviously not very scalable.
The network layer for this project will be modeled after IPv4. Your implementation does not need to handle IP fragmentation, multicast, IP header options, and type-of-service (ToS). You are, however, responsible to correctly set, handle, and verify IP version number, IP header length, IP packet length, time to live (TTL), protocol number, checksum, and source and destination addresses. The identification field is used to uniquely identify each IP packet sent by a host. It is typically used for fragmentation and re-assembly. Although we do not ask you to implement fragmentation, you should set the identification field according to the specification. A simple heuristic such as "incrementing by one each time a packet is sent" is sufficient for our purposes.

Destination Interface

1.1.2.1 1

1.1.1.2 1

Table 3: Forwarding table at Node 3
Section 2.4.2 of the text book has an example implementation of Internet checksum algorithm, which you may use in this project. Prototypes of the IP functions that you need to implement are provided in $PDIR/template/ipforward.h.

The simulator transport layer does not support ICMP. Hence, in case of an error, your forwarding layer will drop the packet and write an error message to stderr, instead of sending an ICMP packet. The format for the error message (to help grading assignments) should be something like
proto x.y.z.w:port -> a.b.c.d:port [optional 2-word reason]

In the first part of the project, the network layer forwarding table can be filled in statically. We provide a user space program (in $PDIR/utils), fdconfig, that can relay forwarding information to your kernel code via a "routing socket", as explained in the routing section of the simulator handout. Your code will be responsible for placing this information into your network layer’s forwarding tables. You should read a network configuration file like $PDIR/template/network.cfg and set the routes. In the second part of the project, you will implement the routing daemon and other infrastructure required to compute the forwarding table dynamically.

Table 3: Forwarding table at Node 3
Destination	Interface
`1.1.2.1`	`1`
`1.1.1.2`	`1`

OSPF Routing

In the second part of the project, you will implement a simple routing daemon. The routing daemon is intended to be implemented as a user-level application. Thus each node should have one routing daemon running.

The job of each routing daemon is to build its node’s forwarding table so that packets can be successfully forwarded to other nodes from that node. To accomplish this you will implement a shortest path link state routing protocol. In this protocol, each node in the network periodically exchanges information with its neighbors so that everyone in the network knows the best path to take to reach each destination.

Basic Operation

You will implement a link-state routing protocol similar to OSPF, which is described in the textbook in chapter 4, and in more detail in the OSPF RFC. Note, however, that your protocol is greatly simplified compared to the actual OSPF spec. As described in the references, OSPF works by having each router maintain an identical database describing the network’s topology. From this database, a routing table is calculated by constructing a shortest-path tree. Each routing update contains the node’s list of neighbors. Upon receiving a routing update, a node updates its routing table with the "best" routes to each destination. In addition, each routing daemon must remove entries from its routing table when they have not been updated for a long time. The routing daemon will have a loop that looks similar the following:
while (1) { /* each iteration of this loop is "cycle" */ wait_for_event(event); if (event == INCOMING_ADVERTISEMENT) { process_incoming_advertisements_from_neighbor(); } else if (event == IT_IS_TIME_TO_ADVERTISE_ROUTES) { advertise_all_routes_to_all_neighbors(); check_for_down_neighbors(); expire_old_routes(); } }
Let’s walk through each step. First, our routing daemon A waits for an event. If the event is an incoming link-state advertisement (LSA), it receives the advertisement and updates its routing table if the LSA is new or has a higher sequence number than the previous entries. If the routing advertisement is from a new router B or has a higher sequence number than the previously observed advertisement from router B, our router A will flood the new announcement to all of its neighbors except the one from which the announcement was received, and will then update its own routing tables.

If the event indicates that a predefined period of time has elapsed and it is time to advertise the routes, then the router advertises all of its links to its direct neighbors. If the routing daemon has not received any such advertisements from a particular neighbor for a number of advertisements, the routing daemon should consider that neighbor down.

If B receives an LSA announcement from A with a lower sequence number than it has previously seen (which can happen, for example, if A reboots), B should echo the prior LSA back to A. When A receives its own announcement back with a higher sequence number, it will increment its transmitted serial number to exceed that of the older LSAs.

Each routing announcement should contain a full state announcement from the router - all of its neighbors and all of its interfaces. This is an inefficient way to manage the announcements (see the extra credit section), but it greatly simplifies the design and implementation of the routing protocol to make it more tractable for a 4 week assignment. Each time your router originates a new LSA, it should increment the serial number it uses. When a router receives an updated LSA, it recomputes its local routing table. The LSAs received from each of the peer nodes tell the router a link in the complete router graph. When a router has received all of the LSAs for the network, it knows the complete graph. Generating the user routing table is simply a matter of running a shortest-paths algorithm over this graph.
Reliable Flooding

OSPF is based upon reliable flooding of link-state advertisements to ensure that every node has an identical copy of the routing state database. After the flooding process, every node should know the exact network topology. When a new LSA arrives at a router, it checks to see if the sequence number on the LSA is higher than it has seen before. If so, the router reliably transmits the message to each of its peers except the one from which the message arrived.

The flooding is made reliable by the use of acknowledgement packets from the neighbors. When router A floods an LSA to router B, router B responds with an "LSA Ack". If router A does not receive such an ack from its neighbor within a certain amount of time, router A will retransmit the LSA to B.

Protocol Specification

The following table shows the routing update message format, with the size of each field in bytes.

Link State Packet Format
Variable	Size (in bytes)	Description
version	1	the protocol version, always set to 1
ttl	1	the time to live of the LSA. It is decremented each hop during flooding, and is initially set to 32
type	2	announcement packets should be type 0, acknowledgements should be type 1
sender node id	4	the node id of the sender, so the receiver knows which neighbor it came from
sequence number	4	the sequence number for the LSA
num link entries	4	the number of nodes that are up and directly connected to the sender node
num interfaces	4	the number of interfaces for this host/router
link entries	4*num link entries	each link entry contains the node id of a node that is directly connected to the sender
interfaces	4*num interfaces	each interface entry contains the ip address of an interface for that host/router

All multi-byte integer fields (nodeIDs, TTLs, link entries, the type field, etc) should be in network byte order.

An acknowledgement packet looks very similar to an announcement packet, but it does not contain any link entries. It contains the sender nodeID and sequence number of the original announcement, so that the peer knows that the LSA has been reliably received.

Implementation

Your routing daemon must communicate routing information with all directly connected neighbors. OSPF runs directly on top of the IP layer (registered as IP protocol 89.) However, we'll be encapsulating our router-router communication into UDP packets. Please use UDP port 1099. One way to do this is to use a single UDP socket that can receive routing information from all neighbors. This means you will be using the Sendto() function, specifying the destination address and port, and the Recvfrom() function, which will tell you who sent the packet. However you may prefer to have a socket per interface. By default UDP is blocking, meaning it will not return until a packet is received. However, your routing daemon must send out periodic routing information, as well as listen for incoming routing information all in one thread. To achieve this you will use a non-blocking UDP socket, which will return immediately even if no packets have been received. To make the Recvfrom() call non-blocking use the MSG_NOBLOCK flag in each call. At this point, you should be able to implement a loop that can send periodic messages and receive messages from multiple neighbors.

There is one more important mechanism for the routing daemon you must implement: the UDP socket used to send routing information must only send to directly connected neighbors and must only send routing information to them if the direct link is up. For example, if a link between two nodes, A and B goes down, an alternate route (if it exists) should be chosen by your routing protocol from A to B going through a third node C. In this case, A should not send routing information to B since they are not directly connected anymore, however, using the IP forwarding table will not achieve this behavior since the routing information will end up going from A to B through C. To fix this problem you will have to set the UDP socket to send packets only to direct neighbors. This can be done by setting the SO_DONTROUTE flag using Setsockopt() on the UDP socket. Once this is done, every call to Sendto() on this UDP socket will force UDP to call ip_output() with the IP_NOROUTE flag. As mentioned in the simulator handout, when this flag is set, the ip_output routine, which you will implement, should use the kernel’s network interface list, which only has directly connected neighbors, to forward the packet instead of the forwarding table, which can reach all nodes in the network.

To update the entries in the node’s routing table, the routing daemon should use routing sockets, as described in the simulator handout. Note that you only have to implement the routing socket routines to add an entry, delete an entry, and change an entry. No information will be returned via the routing socket.

When running the routing daemon, the static method should not be used to populate the routing table. Instead, the routing daemon will assume that the routing table is initially empty, and it will use reliable flooding and the shortest path first routing protocol to update the routing table. On startup, it will read the same network configuration file that the kernel uses upon start-up to find out its immediate neighbors in the network topology. (We have supplied a library, libgetneighbors.a, and an associated header file, get_interface_neighbors.h, to do that.) As an artifact of the config file, you'll need to know your node ID, and since it's stripped by the simulator, you'll need it twice on the command line. Please use:
routed -n node-id node-id config-file
as your command-line.

Evaluation

Testing Utilities

How you test your code is up to you. However, we have provided some utilities to help you test. startkernel.pl is a perl script that will read the network configuration file and launch the appropriate kernels in separate X windows. fdconfig is a program that will allow you to manually manage routes in your kernel. unreliable-server is a program that runs on a simulated node, waiting for a transferred file, which it can write to a file or send to stdout. unreliable-client is a program that will read a file, and attempt to send it to the unreliable-server. pktdrop is a utility that can set a packet drop rate for a node's interface. It only affects the interface's sent packets, so if you want to simulate a cut wire, you'll need to call pktdrop for the interfaces at both ends of that wire. startkernel.pl can be found in the $PDIR/template directory and the rest of these programs can be found in All of these programs can be found in the $PDIR/utils directory.

We will use these programs to test your code. We will also test your code with various topologies. Edit the network config file to do the same.

Grading Rubric

IP Forwarding : 45 points
We will test your ip forwarding code in a variety of ways, using a series of test cases and by code inspection. Our tests may include (but may not be limited to): sending small and large files across routes, sending files across routes with multiple hops, simultaneous file transfers routed through a common router, invalid header elements, checksum issues, IP_NOROUTE support, buffer memory management, correct routing logic, interoperability...
OSPF Routing: 45 points
We will test your router code in a variety of ways, using a series of test cases and by code inspection. Our tests may include (but may not be limited to): routing convergence in small and large networks, link-down re-convergence, link-up re-convergence, proper reliable flooding logic, expiration of old route information, interoperability...
Usability: 10 points
Poor design, documentation, or code structure will probably reduce your grade by making it hard for you to produce a working program and hard for the grader to understand it; egregious failures in these areas will cause your grade to be lowered even if your implementation performs adequately. In addition to the documentation requirements listed in project1, since this is a group project, there should be some documentation about who did what in the README.
Late Policy: See syllabus

Hand-in

You should submit the following files:

Makefile, C source and header files
README
TEST - A description of what tests you have run, how you ran them, and why you think your solution works
Test code/scripts used to test your executables

Your Makefile should produce two executables, named p2_kernel and p2_routed, when make is run with no arguments in your directory. Your makefile should run gcc with option -Wall -Werror and should produce no errors or warnings.

Coding Style

We expect highly readable code. See project 1 for details.

Resources

Simulator handout
Project 2 Simulator Lecture: Important information on concurrency.
J. Postel, RFC 791 - Internet Protocol
L. L. Peterson and B.S. Davie, Computer Networks: A Systems Approach, 3rd edition, Sections 2.4.2, 4.1.4, and 4.2.3, 2003
J. Moy, RFC 2328 - OSPF Version 2
project 1 resources
Jennifer Rexford, Stable Internet Routing without Global Coordination, SCS Distinguished Lecture, February 23rd, 2006, WeH 7500