The second device driver you will write is the keyboard device driver. Whenever the user presses a key, the keyboard device driver needs to find out what key it is, whether it was pushed or released, and make that information available to the operating system. An important aspect of the keyboard device driver is that it is interrupt driven. We will take a closer look at what this means later.
The third "device driver" is the timer handler. While we do not often think of the on board PC timer as a device, it generates interrupts just like the keyboard. The timer is a critical device that is used in context switching between processes.
It is not possible to test these device drivers on their own. In order to develop and test them, you will be provided with a skeleton kernel. The kernel performs some very basic machine initialization such as transitioning to protected mode (see appendix for an explanation of protected mode). Once this initialization is complete, it transfers control to the main() function, which is where you can put your initialization and test routines. There is no virtual memory and no file system. Both of these are things that you will create in later projects.
Once you have implemented the three device drivers, you will use the three of them to implement a simple clock. The clock needs to be adjustable by entering three numbers and pressing return. A more detailed explanation of the desired behavior is presented later in this handout.
Goals
Despite the fact that this is the smallest project of the four that will be assigned in this class, it is important to pay attention to the key concepts in project one. The ideas taught here will provide the foundation for the next three projects. In particular, we would like you to be comfortable with:
The goal of this project set is certainly not to teach the idiosyncrasies of the x86 architecture (or Intel's documentation). That said, it will be necessary to become accustomed to the x86 way of doing things, and the Intel nomenclature, for the purposes of completing this project set. Just keep in mind that the x86 way of doing things is not the only way of doing things. It is the price to be paid for learning the principles of operating systems on a real world system instead of a simulated architecture.
Important Dates
Monday, January 20th: Project one assigned.
Monday, January 27nd: You should be familiar with using Simics, have console output working, and have started on the keyboard and timer interrupts.
Monday, February 3rd: Project one is due at midnight (Monday night).
Welcome to x86 Kernel Programming
These projects will be a new experience for many of you because they are special in two ways. The first is that, as kernel programmers, you are subject to a number of constraints that do not show up in user-level programming. The second is that you will need to look down at the hardware you are running on and manipulate device registers/processor data structures to get your kernel to work. Because this is new for many of you, the following section is devoted to saying a few things about kernel programming, and then summarizing some of the important features of the x86 architecture which you will need to know.
Kernel Programming: Always Assuming the Worst
Programming in the kernel is quite different from writing user level applications. As has been emphasized in class, safety is a priority in kernel code - that is, checking pointers passed to you by user code before you dereference them, testing all possible cases to ensure there are no infinite loops, etc. Kernels need to be as absolutely bulletproof as possible. For example, the following line will happily execute as part of your kernel:
*(char*)0x0 = 0x1badb00b;
Wild pointer accesses are much more dangerous in kernel mode because valuable kernel data structures are not protected by the user mode/kernel mode protection schemes of the processor (discussed here shortly).
There is also a lot of functionality missing in a bare kernel - there is no virtual memory and no filesystem, and none of the devices are initialized. We have taken care of some of these things for you. A group at the University of Utah has authored a set of operating systems components called OSkit. We have used some of these components to take care of bootstrapping and also to provide many of the standard C library functions you know and love.
The entrypoint into the kernel is the main() function in kernel.c. When this function returns, the OSkit support code will wait for a keypress and then reboot the machine. The top level of your kernel, then, is the main() function in kernel.c.
Debugging operating system code is typically quite difficult. One of way of doing it is to have two machines, one running a standard production OS like Linux, and the other running (and presumably crashing) the OS being developed. A serial cable is used to attach the two machines and special serial handlers installed on the development OS allow a debugger on the production OS to debug the development code. Fortunately, you will not have to deal with this. For these projects you will be using an instruction set simulator called Simics. More information on Simics is included at the end of this handout, including how to run it and what you need if you wish to install it on your own computer.
Privilege Levels
The x86 architecture uses privilege levels to make defending the kernel against wild user processes easier. There are four privilege levels ranging from zero to three, with zero being the most privileged and three being the least. At any given time, the processor is "executing" in one of these four privilege levels. As you might have already guessed, your kernel code will execute at privilege level zero, while user code should execute at privilege level three. We will not be using privilege levels one or two in these projects. We will refer to privilege levels zero and three as PL0 and PL3 respectively. Because of the way privilege levels are explained in section 4.5 of intel-sys.pdf, privilege levels are often called "rings" (as in ring zero or ring three), though we will try to avoid that notation in these projects.
The processor checks the privilege level (PL) in a number of circumstances, such as when an attempt is made to execute a privileged instruction. These are instructions that modify control registers or change how the processor is running (a list of these instructions can be found in section 4.9 of intel-sys.pdf).
Segmentation
While we will try to avoid using segmentation as much as possible for the duration of these projects, the x86 architecture requires at least minimal use of segmentation. In order to install interrupts and manage processes you will need to understand what a segment is. Unlike virtual memory however, segments are straightforward and will not do much to complicate your life.
A segment is simply a portion of the address space. Once defined, we can associate important characteristics with the segment, like the priority level required to access the segment or whether it contains code or data. A segment does not need to represent memory that actually exists - for example, we could define a segment ranging from 0 to 4GB (the entire 32 bit address space), even though we might only have 128MB of physical memory.
Because your kernels are going to be complicated enough as it is, we will steer clear of segmentation by using only four segments. These four segments all span the entire 32 bit address space (overlapping each other completely). Two of them require privilege level 0 to be accessed, and two of them only require privilege level 3. Of each privilege level pair, one segment represents code and the other data. We have set up these segments for you. In this project, we have no user processes, so the only segments that will be used are those requiring PL 0 to be accessed. To refer to these segments when configuring kernel data structures (like IDT entries), use the #defines in interrupts.h. These numbers are segment selectors, which are simply indices into another processor data structure, the global descriptor table (GDT). The GDT is the place we have defined our segments. You do not have to reconfigure it or know its format, however.
Communicating with Devices
There are two ways to communicate with a device on the x86 architecture. The first is to send bytes to an I/O port. The second is through memory-mapped I/O.
Most devices in the x86 architecture are accessed through I/O ports. These ports are controlled by special system hardware that has access to the data, address, and control lines on the processor. By using special hardware instructions to read and write from these ports, we can use I/O ports without infringing upon the normal address space of the kernel or user applications. This is because these special instructions tell the hardware that this memory reference is actually an I/O port, not a traditional memory location. For more information on I/O ports, consult chapter 10 of intel-arch.pdf. Both the timer and the keyboard use I/O ports. For convenience, an assortment of C wrapper functions is provided to you to save you from having to break into assembly language to read or write I/O ports. These are located in io_map.h.
There are also some devices which are accessed by reading and writing special addresses in traditional memory (memory-mapped I/O). This part of memory is part of the regular address space and therefore needs to be carefully managed. The console uses memory-mapped I/O (along with some I/O ports for things like cursor position).
Hardware Interrupts
A kernel programmer often uses either I/O ports or memory-mapped I/O to send commands to devices (ranging in size from single byte commands to many words). How do hardware devices communicate with kernel? When a packet arrives at a network interface, the user presses a key on the keyboard or moves the mouse, or any other type of event occurs at a hardware device, that device needs a way of getting the attention of the kernel. One way is for the kernel to keep asking the device if it has something new to report (either periodically or continuously). This is called polled I/O. This is wasteful of CPU time however and there is a better mechanism. That mechanism is the hardware interrupt.
When a device wants to raise a hardware interrupt, it communicates this desire to one of two programmable interrupt controllers (PICs) by asserting some control signals on interrupt request lines. The PICs are responsible for serializing the interrupts (taking possibly concurrent interrupts and ordering them), and then communicating the interrupts to the processor through special control lines. The PICs tell the processor that a hardware interrupt has occurred, as well as which request line the interrupt occurred on so the processor knows how to handle the interrupt. There are some PC conventions as far as what devices are associated with what interrupt request lines.
|
|
Once the processor receives an interrupt from the PIC, it needs to know how to respond. The processor reads a data structure called the interrupt descriptor table (IDT). There is a descriptor in this table for each interrupt. A descriptor contains information about how to resolve the interrupt, most importantly where the interrupt handler is located. The interrupt handler is a piece of code that the author of the device driver writes that gets executed when that device issues an interrupt. Once the processor locates the appropriate entry in the IDT, it saves some information about what it was doing before the interrupt occurred, then starts executing at the address of the interrupt handler. Once the interrupt handler has run to completion, the processor uses the saved information to resume its previous task. Note that IRQ numbers do not necessarily correspond to matching indices into the IDT. The PICs have the ability to map their IRQ lines to any entry in the IDT. You will be given information on which IDT entries correspond to interrupts pertaining to your projects.
Interrupt handlers typically need to execute as quickly as possible so that the processor is free to accept future interrupts. When an interrupt is sent by the PIC, the PIC will not send another interrupt from that same source until it gets acnowledged through an I/O port. This is because interrupt handlers usually manipulate critical data structures and would not withstand being interrupted by new invocations of themselves (they are not reentrant). In particular, an interrupt handler must never block on anything. Most interrupt handlers simply make a note of work that must be done as a result of the interrupt, clear the interrupt, then terminate, leaving the work to be done at a more convenient time. Note that it may be possible for one interrupt handler to be interrupted by a different interrupt handler, so long as they do not share data structures.
Software Interrupts and Exceptions
Hardware interrupts are not the only type of interrupt. Programs can issue software interrupts as well. These interrupts are often used as a way to transfer execution to the kernel in a controlled manner, for example during a system call. To perform a software interrupt a user application will execute a special instruction (int n) which will cause the processor to execute the nth handler in the IDT.
In addition to hardware and software interrupt handlers, the IDT also contains information about exception handlers. Exceptions are conditions in the processor that are usually unintended and need to be addressed. Page faults, divide-by-zero, and segmentation faults are all types of exceptions. We will cover both software interrupts and exceptions in more detail later, especially in project three. For now, we just need to look at how to install an interrupt handler.
Configuring Interrupts
As mentioned previously, an x86 processor uses the interrupt descriptor table (IDT) to find the address of the proper interrupt handler when an interrupt occurs. To install your keyboard (and timer) interrupt handler, you will have to install an entry in this table.
An entry in the IDT can be one of three different types: a task gate, an interrupt gate, or a trap gate (a "gate" is just a descriptor that contains information about a transition to a different piece of code - think "gate to new procedure"). We will not be using task gates because they make use of the processor's hardware task switching functionality (which we also will not use for reasons that will become clear later in the course). The difference between an interrupt gate and a trap gate is that when the processor accesses an interrupt handler through an interrupt gate, it disables all further interrupts by clearing a flag in one of the processor's registers. Handling interrupts through trap gates does not do this. For the interrupts in this project, we will be using trap gates, because the timer and keyboard interrupts do not interfere with each other and therefore we need not disable interrupts (aside from interrupts from the same source, which is done automatically) when executing their handlers.
The format of the trap gate is given on page 151 of intel-sys.pdf. Note that regardless of the type of the gate, the descriptor is 64 bits long. You can use this fact to index into the IDT without knowing the type of the other gates in the table. To get the base address of the IDT, we use an instruction called sidt. We have provided a C wrapper function for this instruction so you do not have to break into assembly. This is supplied in interrupts.h.
Note that there is also a function in interrupts.h called remap_intr(). You must call this after installing your interrupt handlers. Initially, the PICs are programmed so that they call entries in the IDT that conflict with processor exceptions. This function remaps the PICs to higher IDT entries.
Some of the fields in the trap gate are self-explanatory, however others are not so they are summarized here:
DPL: The privilege level required to execute the handler. If it is set to 3, then user processes can execute the handler directly. For hardware interrupts (like the keyboard and timer), it should be set to 0.
Offset: The offset into the segment of the interrupt handler. Since all of our segments take up the whole address space, this is simply the address of your function handler. This can be obtained in C by typing the name of the function (without any parenthesis or arguments).
Present: Must be set to 1 for a working interrupt handler.
Segment Selector: The segment selector for the target code segment. The interrupt handler is going to be executed at PL0 and it is code, so this needs to be set to KERNEL_CS (defined in interrupts.h).
Writing an Interrupt Handler
As mentioned above, when the processor receives an interrupt it uses the IDT to start executing the interrupt handler. Before the interrupt handler executes, however, the processor pushes some information onto the stack so that it can resume its previous task when the handler has completed. The exact contents and order of this information is presented on page 153 of intel-sys.pdf. In project one there are no user tasks, so the processor will be executing in PL0 all the time. This means interrupts will not cause a privilege level change, so the EFLAGS (a system register containing an assortment of important flags), current code segment, and the instruction pointer will be pushed on the stack. There is no error code - this is for exception handlers.
This information is indeed enough to resume executing whatever code was running when the interrupt first arrived. However, in order to service the interrupt we need to execute some code; this code will clobber the values in the general purpose registers. When we resume executing normal code, that code will expect to see the same values in the register, as if no interrupt had ever occured. So the first thing an interrupt handler must do is save all the general purpose registers, plus %ebp. We need not save the stack pointer because if there is no stack change (as is the case in this project since we have no user processes), the stack pointer will be correct once we pop off all of our interrupt-related information. If there is a stack change (as there will be in project three when you do have user processes), the interrupt saves the stack pointer for us. So, to recap, the registers you need to save are %eax, %ebx, %ecx, %edx, %esi, %edi, and %ebp.
The easiest way to save these registers is to just save them on the stack. This is easily done with the pusha and popa instructions (more information on these can be found on pages 624 and 576 of intel-isr.pdf). To be sure that you save the registers before anything else occurs, you should at least write a wrapper for your interrupt handler in assembly. All you must do in assembly is save the registers, call your C handler (try the call instruction), then restore the registers. To return from an interrupt, use the iret instruction. It uses the information initially saved on the stack by the interrupt to return to the code executing when the interrupt occured.
A final note about writing assembly. Comment your assembly profusely. One comment per instruction is not a bad rule of thumb. Keep your assembly in separate files, and to export a symbol (like asm_timer_wrapper) so you can refer to it elsewhere, use the directive .globl, like this:
.globl asm_timer_wrapper
Console Device Driver Specification
The job of the console device driver seems simple: take characters and print them on the screen. For the most part, it is that simple. Keep in mind that this console will be the front end of your kernel for the rest of the semester, however. For this reason you will probably want it to do a few basic things like scroll when the output reaches the bottom of the console, line wrap when a line exceeds the width of the console, etc. The first thing to do is to understand how to print a single character to the console.
Writing a Character to the Console
The contents of the console are controlled by a region of main memory (memory mapped I/O). Each character on the console is represented in this region by a byte pair. The first byte in this pair is simply the character itself. The second byte controls the foreground and background colors used to draw the character. These byte pairs are stored in row major order. For your console, there should be 25 rows of 80 characters each. The location of video memory, as well as the color codes used for the second byte of each pair, are defined in video_defines.h.
Writing a character to the console is as simple as writing a byte pair to video memory. To write the character 'M' on the 3rd character of the 2nd line, you would do something like this:
*(CONSOLE_MEM_BASE + 2*(CONSOLE_WIDTH + 2)) = 'M'; *(CONSOLE_MEM_BASE + 2*(CONSOLE_WIDTH + 2) + 1) = console_color; where CONSOLE_MEM_BASE is the defined location of the console memory, CONSOLE_WIDTH is the width of the console in characters (defined to be 80), and console_color is a variable containing the current foreground and background color specification of your choice.
When the bare-bones kernel initializes the hardware, it will do the dirty work of setting up the console in 80x25 format. However, you will also need to reposition or hide the cursor. The cursor is controlled by the Cathode Ray Tube Controller (CRTC), an important device on the video card. Communication with the CRTC is accomplished with a special pair of registers. Accessing these registers is a little different from writing to console memory as it uses the I/O ports (discussed earlier).
Communicating with the CRTC
The CRTC is one of the devices which is accessed through I/O ports. It has two registers, an index register and a data register. The index register tells the CRTC what function you are performing on it, such as setting the cursor position. The data register then accepts a data value associated with the operation. Because the data register is only one byte long, and the offset of the cursor (which is measured in single bytes, not two byte pairs) is a two byte quantity, setting the cursor position is done in two steps. The commands to send to the CRTC, as well as the location of the CRTC I/O ports, are defined in video_defines.h. To hide the cursor completely, simply set the cursor to an offset greater than the size of the console.
Console Device Driver Interface
Now that you know how to write characters to the console and move/hide the cursor, you have the primitives you need to build a console device driver. Here is the API that you need to implement for the driver.
void putbytes(const char* s, int len);
Description: Prints the string s, starting at the current location of the cursor. If the string is longer than the current line, the string should fill up the current line and then continue on the next line. If the string exceeds available space on the entire console, the screen should scroll up one line, and then the string should continue on the new line. If the newline character '\n' is encountered in the string, the portion of the string following the newline should continue on the next line (scrolling if necessary). If the carriage return character '\r' is encountered, the cursor should be immediately reset to the beginning of the current line, causing the portion of the string following the carriage return to overwrite any existing output on the line. If len is not a positive integer or s is null, the function has no effect.
const char* s: The string to be printed.
int len: The length of the string s.
Returns: Void.
void set_term_color(int color);
Description: Changes the foreground and background color of future characters printed on the console. If the color code is invalid, the function has no effect.
int color: The new color code.
Returns: Void.
void get_term_color(int* color);
Description: Writes the current foreground and background color of characters printed on the console into the argument color.
int* color: The address to which the current color information will be written.
Returns: Void.
void set_cursor(int row, int col);
Description: Sets the position of the cursor to the position (row, col). Subsequent calls to putbytes should cause the console output to begin at the new position. If the cursor is currently hidden, a call to set_cursor() must not show the cursor.
int row: The new row for the cursor.
int col: The new column for the cursor.
Returns: Void.
void get_cursor(int* row, int* col);
Description: Writes the current position of the cursor into the arguments row and col.
int* row: The address to which the current cursor row will be written.
int* col: The address to which the current cursor column will be written.
Returns: Void.
void hide_cursor();
Description: Hides the cursor. Subsequent calls to putbytes must not cause the cursor to show again.
Returns: Void.
void show_cursor();
Description: Shows the cursor. If the cursor is already shown, the function has no effect.
Returns: Void.
void clear_console();
Description: Clears the entire console.
Returns: Void.
Timer Device Driver Specification
Unlike the console, the timer generates interrupts that must be handled properly. If you do not handle timer interrupts quickly enough, the timer will generate the next timer interrupt before the PIC has been reset by your timer interrupt handler, and a timer interrupt will be lost.
The actual timer interrupt handler in this project is quite simple. Although you will be using the timer interrupt to trigger your scheduler in project 3, here all it needs to do is increment a tick counter. This tick counter will allow you to create your clock.
Configuring the Timer
Communicating with the timer is done through I/O ports. These I/O ports are defined in timer.h. Also defined in timer.h is the internal rate of the PC timer, 1193182 Hz. Fortunately, we can configure the timer give us interrupts at a fraction of that rate. For convenience, you should configure the timer to generate interrupts every 10 milliseconds.
To initialize the timer, first set its mode by sending TIMER_SQUARE_WAVE (defined in timer_defines.h) to TIMER_MODE_IO_PORT (also defined in timer_defines.h). The timer will then expect you to send it the number of timer cycles between interrupts. This rate is a two byte quantity, so first send it the least significant byte, then the most significant byte. These bytes should be sent to TIMER_PERIOD_IO_PORT (defined in timer_defines.h).
When the timer interrupt occurs the processor consults the IDT to find out where the timer handler is. The index into the IDT for the timer is TIMER_IDT_ENTRY, defined in timer_defines.h. You will need to complete this entry for your timer handler to execute properly.
Aside from incrementing your tick counter, your timer interrupt handler should save and restore the general purpose registers. You also need to tell the PIC that you have processed the most recent interrupt that the PIC delivered. This is done by sending an INT_CTL_DONE to one of the PIC's I/O ports, INT_CTL_REG. These are defined in interrupts.h.
Note: You will be testing this on an instruction set simulator. Even though you are simulating an older processor on relatively fast machine, Simics does not make an effort to exactly correlate the simulation to real wall clock time. Your timer may appear a little wobbly in the simulator, but it should be roughly accurate. If you have it set up properly, the timer will be exactly correct on real hardware.
Timer Device Driver Interface
The timer has no function API. Instead, there is just one global integer that should be incremented every ten milliseconds.
int timer_ticks;
An integer value that is initialized to 0 at boot time, and is incremented once every ten milliseconds.
Keyboard Device Driver Specification
Like the timer, the keyboard is also interrupt driven. However we are not only interested in the fact that a keyboard interrupt happened; each keyboard interrupt also has data that comes along with it. The information retrieved from the keyboard also needs to be processed in order to turn it into a stream of intelligible characters suitable for delivery to application processes. At a high level, the keyboard device driver provides a buffer that contains characters returned by the readchar() function.
Interacting with the Keyboard
One would think that reading keys from the keyboard would be as simple as receiving a simple character stream. Unfortunately it is not that easy. For one thing, both key presses and key releases are reported by the keyboard via interrupts. The other complicating factor is that the data reported by the keyboard is in a special format called scan codes. These codes need to be converted into normal ASCII characters. For your convenience, we provide a function that converts a scan code into an ASCII character. This function (called process_scancode) is described in keyhelp.h.
When you receive a keyboard interrupt, your keyboard handler needs to read the scancode from the keyboard by reading a byte from the keyboard's I/O port (this is defined in keyhelp.h). You also need to tell the PIC that you have processed the keyboard interrupt. This is done by sending a INT_CTL_DONE to one of the PIC's I/O ports, INT_CTL_REG. In the interest of making the interrupt handler as short as possible, you should postpone processing the scan code until you are no longer inside an interrupt handler.
Do not forget that your keyboard interrupt handler needs to save and restore the general purpose registers! Also, we have included functions that enable and disable interrupts in interrupts.h. You will most likely need to use them once in your keyboard driver to ensure there are no interrupt-related concurrency problems.
Keyboard Device Driver Interface
The keyboard device driver has a very simple interface - it is just one function, readchar().
char readchar();
Description: Returns the next character in the keyboard buffer. This function does not block if there are no characters in the keyboard buffer.
Returns: The next character in the keyboard buffer, or -1 if the keyboard buffer is currently empty.
Tying it All Together
Once you have implemented and tested the console, timer, and keyboard, you can use the three of them to turn your $1000 (more or less) computer into a simple clock. We would like you to write some code in your kernel so that it exhibits the following behavior:
Because Simics simulates every instruction executed, it has some powerful debugging capabilities. It allows, for example, breakpoints to be set on memory accesses to certain locations (specifically reads, writes, executes, or any combination thereof). It also allows temporal breakpoints - breakpoints that occur after a certain number of instructions have been executed. Like the popular debugger GDB (which most of you should be familiar with from 15-213) it has some symbolic debugger capabilities as well.
Simics has been installed on AFS and thus may be used from any Linux or Solaris machine connected to AFS, subject to licensing restrictions. The proj1 tarball includes scripts to run both versions. Note however that even if you choose to work on Solaris, your code must be compiled on Linux.
Instructions on getting Simics to work within your development environment are on the Projects section of the course website.