15-213 (Spring 2006) Section F - Recitation #3

TA: Jernej Barbic
Adapted from Kun Gao's recitation (Spring 2005)
Let's go Steelers!!!

Lab 2:

The bomb is a 64-bit binary unix executable. You have to use the fish machines for this lab.
No source code is given, with the exception of the main() routine.
Each person has a different bomb.
If you make a mistake and the bomb explodes, you lose 1/2 point (up to a maximum of 20 points).

Compiling an example bomb (just for the recitation; different from the Lab 2 problems)

I generated a sample bomb program to illustrate the main points of Lab 2. The program was compiled using:
gcc -O1 bomb.c -o bomb
Since the -g switch is not present, the binary contains no debugging information.

General note on compiling for debugging:

Normally, to enable the debugger to use the source code, you would compile a program using:
gcc -g bomb.c -o bomb (for lowest level of optimization), or
gcc -g -O2 bomb.c -o bomb (for level 2 of optimization)

The -g -O2 combination is valid and lets you debug the optimized executable. However, the compiler will have applied many optimizations, which in my experience makes it harder to step through the code. Using -g with no optimizations works best for debugging with source code. Debugging with source code is not the debugging style of Lab 2; we will work with the assembly code directly.


Running the bomb

The bomb can be invoked by:
./bomb

The program waits for you to enter a string.
You can enter the input from the keyboard, or read it in from a file:
./bomb < solution.txt

The bomb then examines the string, and either explodes, or not.

Lab 2 problem statement

What input string should we give to the program so that the bomb doesn't explode, assuming we don't have access to the program's source code?


objdump

'objdump' is a standard unix utility that can disassemble any unix executable:
objdump -d bomb > bomb.asm
Now, bomb.asm contains the assembly code obtained from the executable 'bomb'.

Assembly code for our example bomb:
...

00000000004005d8 <explode_bomb>:
  4005d8:       48 83 ec 08             sub    $0x8,%rsp
  4005dc:       bf 3c 07 40 00          mov    $0x40073c,%edi
  4005e1:       e8 22 ff ff ff          callq  400508 <puts@plt>
  4005e6:       bf 01 00 00 00          mov    $0x1,%edi
  4005eb:       e8 08 ff ff ff          callq  4004f8 <exit@plt>

...


00000000004005f0 <main>:
  4005f0:       53                      push   %rbx
  4005f1:       48 83 ec 50             sub    $0x50,%rsp
  4005f5:       48 89 e2                mov    %rsp,%rdx
  4005f8:       be 46 07 40 00          mov    $0x400746,%esi
  4005fd:       48 8b 3d 5c 04 10 00    mov    1049692(%rip),%rdi        # 500a60 <__bss_start>
  400604:       b8 00 00 00 00          mov    $0x0,%eax
  400609:       e8 ca fe ff ff          callq  4004d8 <fscanf@plt>
  40060e:       b9 00 00 00 00          mov    $0x0,%ecx
  400613:       ba 0a 00 00 00          mov    $0xa,%edx
  400618:       be 00 00 00 00          mov    $0x0,%esi
  40061d:       48 89 e7                mov    %rsp,%rdi
  400620:       e8 c3 fe ff ff          callq  4004e8 <__strtol_internal@plt>
  400625:       48 83 f8 fe             cmp    $0xfffffffffffffffe,%rax
  400629:       7c 0a                   jl     400635 <main+0x45>
  40062b:       b8 00 00 00 00          mov    $0x0,%eax
  400630:       e8 a3 ff ff ff          callq  4005d8 <explode_bomb>
  400635:       bf 49 07 40 00          mov    $0x400749,%edi
  40063a:       e8 c9 fe ff ff          callq  400508 <puts@plt>
  40063f:       b8 00 00 00 00          mov    $0x0,%eax
  400644:       48 83 c4 50             add    $0x50,%rsp
  400648:       5b                      pop    %rbx
  400649:       c3                      retq

...
Also, we can extract the symbol table:
objdump -t bomb > bomb.tbl
The symbol table is sometimes useful to identify calls to standard library functions, such as printf, etc. Note that the symbol table is present in the executable even if it was compiled without the -g switch (unless the executable has been explicitly stripped). Also note that using the symbol table is not strictly necessary for Lab 2.
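
A side note on reading the listing: the addresses that objdump prints after callq and in the '#' comments can be recomputed from the raw instruction bytes. The sketch below assumes the usual x86-64 encodings (a near call is the opcode e8 followed by a 32-bit little-endian displacement, and a disp(%rip) operand is relative to the address of the next instruction). The function names call_target and rip_relative are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Target of a near call: the address of the NEXT instruction plus the signed
   32-bit displacement stored little-endian after the e8 opcode. */
uint64_t call_target(uint64_t insn_addr, const uint8_t rel[4])
{
    uint32_t raw = (uint32_t)rel[0] | ((uint32_t)rel[1] << 8) |
                   ((uint32_t)rel[2] << 16) | ((uint32_t)rel[3] << 24);
    uint64_t next = insn_addr + 5;   /* 1 opcode byte + 4 displacement bytes */
    return next + (uint64_t)(int64_t)(int32_t)raw;   /* sign-extend, then add */
}

/* Address referenced by a disp(%rip) operand: %rip holds the address of the
   next instruction, so add the displacement to insn_addr + insn_len. */
uint64_t rip_relative(uint64_t insn_addr, unsigned insn_len, int32_t disp)
{
    assert(insn_len > 0 && insn_len <= 15);  /* x86-64 instructions are at most 15 bytes */
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

For example, the call at 4005e1 has displacement bytes 22 ff ff ff, so call_target(0x4005e1, bytes) gives 0x400508, the puts@plt entry shown in the listing. Likewise, rip_relative(0x4005fd, 7, 1049692) gives 0x500a60, matching the '# 500a60' comment.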

GDB (GNU DeBugger)


Now all we need to do is completely understand the assembly code, and we can defuse the bomb.
In Lab 2, we will be dealing with a lot of code, which can be difficult to understand. Even if we do a good job, we might make a mistake and accidentally detonate the bomb. This is where gdb comes in. It lets us step through the assembly code as it runs, and examine the contents of registers and memory. We can also set breakpoints at arbitrary positions in the program. Breakpoints are points in the code where program execution is instructed to stop. This way, we can let the program run without interruption over large portions of code, such as code that we already understand or believe is error-free.
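
For the example bomb, "understanding the assembly code" boils down to the compare-and-branch after the strtol call in main: cmp $0xfffffffffffffffe,%rax (that is, -2 as a 64-bit value) followed by jl 400635, which jumps over the call to explode_bomb. Here is a minimal sketch of that logic restated in C. The helper names explodes_for_value and explodes_for_input are hypothetical, and the sketch only models the decision, not the rest of the program.

```c
#include <assert.h>
#include <stdlib.h>

/* Mirrors "cmp $-2,%rax; jl 400635": the jump skips explode_bomb only when
   the value returned by strtol is strictly less than -2 (signed compare). */
int explodes_for_value(long rax)
{
    return !(rax < -2);
}

/* Hypothetical helper: applies the same base-10 strtol call (the 0xa loaded
   into %edx is the base, the 0 in %esi is the NULL end pointer) to a token. */
int explodes_for_input(const char *token)
{
    assert(token != NULL);
    return explodes_for_value(strtol(token, NULL, 10));
}
```

Under this reading, an input such as -3 is safe, while -2, 7, or a non-numeric word (for which strtol returns 0) would all explode.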

Starting gdb

Start gdb by specifying what executable to debug:
'gdb bomb'

We can run the bomb in the debugger just as we would outside the debugger, except that we can instruct the program to stop at certain locations and inspect current values of memory and registers. As a last resort, we can use (Ctrl-C) to stop the program and panic out. But this is not recommended and is usually not necessary, as long as we positioned our breakpoints appropriately.

To start a program inside gdb:
(gdb) run

To start a program inside gdb, with certain input parameters:
(gdb) run parameters

Examples:
(gdb) run < solution.txt
(equivalent to ./bomb < solution.txt , just this time inside gdb)

(gdb) run -d 1
(equivalent to ./bomb -d 1; this is a made-up example in the specific case of the bomb program, as 'bomb' supports no such parameters; it is only meant to demonstrate how things would work in general)

Exiting gdb

To exit gdb and return to the shell prompt:
(gdb) quit
Note that exiting gdb means you lose all of the breakpoints that you set in this gdb session. When you re-run gdb, you need to respecify any breakpoints that you want to re-use. A common mistake is to forget this and then let the debugging proceed straight into the explode_bomb() routine.

Breakpoints

We wouldn't be using gdb if all we did was run the program without any interruptions. We need to stop program execution at certain key positions in the code, and then examine program behavior around those positions. How do we pick a good location for a breakpoint?

First, we can always set a breakpoint at 'main', since every C program has a function called 'main'.

In Lab 2, Dr. Evil accidentally gave us 'bomb.c'. By examining this code, we see that we can place a good breakpoint at 'phase_1', as this is where our input is examined (examine bomb.c).

(gdb) break phase_1
Note: if you mistype the name of the routine, gdb will print a warning and not set any breakpoints.

Also note that program execution will always stop just BEFORE executing the instruction you set the breakpoint on.

Another essential breakpoint to set is on the explode_bomb routine:
(gdb) break explode_bomb

For inputs that don't solve the puzzle, this breakpoint will be your last safeguard before explosion. I recommend ALWAYS setting this breakpoint. In addition, I recommend setting another breakpoint inside explode_bomb, positioned after the call to the routine that prints "BOOM!", but before the call to the routine that notifies the server of the explosion. This can be useful if you accidentally enter explode_bomb, but don't notice that you hit the safeguard breakpoint. After several hours of debugging, when concentration drops, it can happen that you accidentally instruct the program to keep on going. The second breakpoint will save you.

To set a breakpoint at the machine instruction located at the address 0x401A23:
(gdb) break *0x401A23
Note: don't forget the '0x'. If you forget it, and you are unlucky enough that the address doesn't contain any of the characters A, B, C, D, E, F, the breakpoint address will be interpreted as a decimal number. This results in a completely different address from the one you wanted, and the breakpoint won't work as expected.
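
To see how far off the address ends up, here is a small C analogy. It is not gdb's own parser; it just uses strtoul with base 0, which reads numbers the way C integer literals are read. The address 0x401823 is a hypothetical all-digit example (the address above contains an 'A', so it would not hit this trap), and parse_address is a made-up name.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse an address the way a C integer literal is read (base 0): a leading
   "0x" means hex, a leading "0" means octal, otherwise the digits are decimal. */
unsigned long parse_address(const char *text)
{
    assert(text != NULL);
    return strtoul(text, NULL, 0);
}
```

With this helper, parse_address("0x401823") yields 0x401823, whereas parse_address("401823") yields decimal 401823, which is 0x6219f: an address nowhere near the intended instruction.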

To see what breakpoints are currently set:
(gdb) info break

To delete one or more breakpoints:
(gdb) delete <breakpoint number>
Example:
(gdb) delete 4 7
erases breakpoints 4 and 7.

Terminating program execution from within gdb

We can terminate the program at any time:

(gdb) kill
Note that this doesn't exit gdb, and all your breakpoints remain active. You can re-run the program using the run command, and all breakpoints still apply.

Stepping through the code

To execute a single machine instruction, use:
(gdb) stepi
Note that if you use 'stepi' on a callq instruction, the debugger will proceed into the called function.
Also note that pressing <ENTER> re-executes the last gdb command. To execute several 'stepi' instructions one after another, type 'stepi' once, and then press <ENTER> several times in a row.

Sometimes we want to execute a single machine instruction, but if that instruction is a call to a function, we want the debugger to execute the function without our intervention. This is achieved using 'nexti':
(gdb) nexti
The program will stop as soon as control returns from the function, i.e. at the instruction immediately after callq in the caller function.

If you accidentally use stepi to enter a function call, and you really don't want to debug that function, you can use 'finish' to resume execution until the current function returns. Execution will stop at the machine instruction immediately after the 'callq' instruction in the caller function, just as if we had called 'nexti' in the first place:
(gdb) finish
Note: make sure the current function can really be run safely without your intervention. You don't want it to call explode_bomb.

To instruct the program to execute (without your intervention) until the next breakpoint is hit, use:
(gdb) continue
The same warning as in the case of 'finish' applies.

If the program contains debugging information (-g switch to gcc; not the case in Lab 2, but otherwise usually the case), we can also step a single C statement:
(gdb) step

Or, if the next statement contains a function call, we can use 'next' to execute the function without our intervention. This is just like nexti, except that it operates on C statements as opposed to machine instructions:
(gdb) next

Disassembling code using gdb

You can use 'disassemble' to disassemble a function or a specified address range.

To disassemble function explode_bomb:
(gdb) disassemble explode_bomb

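To disassemble the address range from 0x4005dc to 0x4005eb (note: newer gdb versions expect a comma between the two addresses, i.e. 'disassemble 0x4005dc,0x4005eb'):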
(gdb) disassemble 0x4005dc 0x4005eb

Examining registers

To inspect the current values of registers:
(gdb) info registers
This prints out the current values of all registers.

To inspect the current value of a specific register:
(gdb) p $rax

To print the value in hex notation:
(gdb) p/x $rax

Note: using 'p $eax' to print just the lower 32 bits of the register doesn't work (at least with the current version of gdb on the fish machines). You have to print a full 64-bit register.
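
If you only care about the lower 32 bits (the part that would be called %eax), you can mask them out of the full 64-bit value yourself. The C sketch below shows the arithmetic; low32 and low32_signed are made-up names. Since gdb evaluates C-style expressions, something like 'p/x $rax & 0xffffffff' should do the same inside the debugger, but treat that as an assumption for the gdb version on the fish machines.

```c
#include <assert.h>
#include <stdint.h>

/* Lower 32 bits of a 64-bit register value (what %eax would hold). */
uint32_t low32(uint64_t rax)
{
    return (uint32_t)(rax & 0xffffffffu);
}

/* The same 32 bits read as a signed two's complement number. */
int32_t low32_signed(uint64_t rax)
{
    uint32_t lo = low32(rax);
    return lo >= 0x80000000u ? (int32_t)((int64_t)lo - 4294967296LL)
                             : (int32_t)lo;
}

/* Example: %rax holding -2 as a 64-bit value. */
void low32_example(void)
{
    assert(low32(0xfffffffffffffffeULL) == 0xfffffffeu);
    assert(low32_signed(0xfffffffffffffffeULL) == -2);
}
```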

To see the address of the next machine instruction to be executed:
(gdb) frame
or, equivalently, you can inspect the instruction pointer register:
(gdb) p/x $rip

Normally, when debugging a C/C++ program for which the source code is available (not the case with Lab 2), you can also inspect the call-stack (a list of all nested function calls that led to the current function being executed):
(gdb) where

Examining memory

To inspect the value of memory at location 0x400746:
(gdb) x/NFU 0x400746
Here:
N = number of units to display
F = output format (hex=x, signed decimal=d, unsigned decimal=u, string=s, char=c)
U = defines what constitutes a unit: b=1 byte, h=2 bytes, w=4 bytes, g=8 bytes
Note that the output format letters and the unit letters do not overlap, so gdb can tell them apart.

Examples:
To use hex notation, and print two consecutive 64-bit words, starting at the address 0x400746:
(gdb) x/2xg 0x400746
To print a null-terminated string at location 0x400746:
(gdb) x/s 0x400746
To use hex notation, and print five consecutive 32-bit words, starting from the address 0x400746:
(gdb) x/5xw 0x400746
To print a single 32-bit word, in decimal notation, at the address 0x400746:
(gdb) x/1dw 0x400746
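
One thing that trips people up with the unit sizes: x86-64 is little-endian, so the byte at the lowest address becomes the least significant byte of the printed value. The sketch below shows how the same bytes turn into different values for the b, h, w and g unit sizes. The function name read_unit is made up, and the sample bytes are an assumption about what sits at 0x400746 in the example bomb: the string "%s" with its terminating zero, followed by the start of "Bomb was safely defused.".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of one unit of unit_bytes bytes starting at p, read little-endian
   (unit_bytes = 1, 2, 4 or 8, matching gdb's b, h, w and g unit sizes). */
uint64_t read_unit(const uint8_t *p, size_t unit_bytes)
{
    assert(unit_bytes == 1 || unit_bytes == 2 ||
           unit_bytes == 4 || unit_bytes == 8);
    uint64_t value = 0;
    for (size_t i = 0; i < unit_bytes; i++)
        value |= (uint64_t)p[i] << (8 * i);  /* lowest address = least significant byte */
    return value;
}
```

With the assumed bytes 25 73 00 42 6f 6d 62 20 ('%', 's', 0, 'B', 'o', 'm', 'b', ' '), a 4-byte unit reads as 0x42007325 and an 8-byte unit as 0x20626d6f42007325. This is why strings look "backwards" when dumped as words, and why x/s is the better choice for text.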


The source code for the example bomb:

#include<stdio.h>
#include<stdlib.h>

void explode_bomb()
{
  printf("KABOOM!!!\n");
  exit(1);
}

int main()
{

  char s[80];
  fscanf(stdin,"%s",s);

  if(strtol(s,NULL,10) >= -2)
    explode_bomb();


  printf("Bomb was safely defused.\n");
  return 0;
}


Refer to the gdb notes online for a quick reference:
http://csapp.cs.cmu.edu/public/docs/gdbnotes.txt