15-213 (Spring 2006) Section F - Recitation #3
TA: Jernej Barbic
Adapted from Kun Gao's recitation (Spring 2005)
Let's go Steelers!!!
Lab 2:
The bomb is a 64-bit binary unix executable.
You have to use the fish machines for this lab.
No source code is given, with the exception of the main() routine.
Each person has a different bomb.
If you make a mistake and the bomb explodes, you lose 1/2 point per explosion (up to a maximum of 20 points).
Compiling an example bomb (just for the recitation; different from the Lab 2 problems)
I generated a sample bomb program to illustrate the main
points of Lab 2. The program was compiled using:
gcc -O1 bomb.c -o bomb
Since the -g switch is not present, the binary contains no debugging information.
General note on compiling for debugging:
Normally, to enable the debugger to use the source code,
you would compile a program using:
gcc -g bomb.c -o bomb (for lowest level of optimization), or
gcc -g -O2 bomb.c -o bomb (for level 2 of optimization)
The -g -O2 combination is valid and enables one
to debug the optimized executable.
However, the compiler will have applied
many optimizations, which in my experience makes
it more difficult to step through the code. Using -g with
no optimizations works best for debugging with source code.
Debugging with source code is not the debugging style of Lab 2 -
we will work with the assembly code directly.
Running the bomb
The bomb can be invoked by:
./bomb
The program waits for you to enter a string.
You can enter the input from the keyboard, or read it in from a file:
./bomb < solution.txt
The bomb then examines the string, and either explodes, or not.
Lab 2 problem statement
What input string should we give
to the program so that the bomb doesn't explode, assuming we don't
have access to the program's source code?
objdump
'objdump' is a standard unix utility that can disassemble any unix executable:
objdump -d bomb > bomb.asm
Now, bomb.asm contains the assembly code obtained from the executable 'bomb'.
Assembly code for our example bomb:
...
00000000004005d8 <explode_bomb>:
4005d8: 48 83 ec 08 sub $0x8,%rsp
4005dc: bf 3c 07 40 00 mov $0x40073c,%edi
4005e1: e8 22 ff ff ff callq 400508 <puts@plt>
4005e6: bf 01 00 00 00 mov $0x1,%edi
4005eb: e8 08 ff ff ff callq 4004f8 <exit@plt>
...
00000000004005f0 <main>:
4005f0: 53 push %rbx
4005f1: 48 83 ec 50 sub $0x50,%rsp
4005f5: 48 89 e2 mov %rsp,%rdx
4005f8: be 46 07 40 00 mov $0x400746,%esi
4005fd: 48 8b 3d 5c 04 10 00 mov 1049692(%rip),%rdi # 500a60 <__bss_start>
400604: b8 00 00 00 00 mov $0x0,%eax
400609: e8 ca fe ff ff callq 4004d8 <fscanf@plt>
40060e: b9 00 00 00 00 mov $0x0,%ecx
400613: ba 0a 00 00 00 mov $0xa,%edx
400618: be 00 00 00 00 mov $0x0,%esi
40061d: 48 89 e7 mov %rsp,%rdi
400620: e8 c3 fe ff ff callq 4004e8 <__strtol_internal@plt>
400625: 48 83 f8 fe cmp $0xfffffffffffffffe,%rax
400629: 7c 0a jl 400635 <main+0x45>
40062b: b8 00 00 00 00 mov $0x0,%eax
400630: e8 a3 ff ff ff callq 4005d8 <explode_bomb>
400635: bf 49 07 40 00 mov $0x400749,%edi
40063a: e8 c9 fe ff ff callq 400508 <puts@plt>
40063f: b8 00 00 00 00 mov $0x0,%eax
400644: 48 83 c4 50 add $0x50,%rsp
400648: 5b pop %rbx
400649: c3 retq
...
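Reading the listing for main by hand gives a rough C equivalent of the check that decides whether the bomb explodes. The sketch below is a hand reconstruction from the assembly, not the original source. The name would_defuse is made up for illustration, and the function models only the strtol call and the cmp/jl pair that guards the call to explode_bomb. Since jl is a signed "less than" jump, the call to explode_bomb is skipped only when the value returned by strtol is less than -2.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of the bomb): returns 1 if the example
   bomb would be defused by this input string, 0 if it would explode. */
int would_defuse(const char *input)
{
    /* mov $0xa,%edx ; mov $0x0,%esi ; mov %rsp,%rdi ; callq strtol
       corresponds to strtol(input, NULL, 10); the result is in %rax. */
    long value = strtol(input, NULL, 10);

    /* cmp $0xfffffffffffffffe,%rax ; jl 400635
       0xfffffffffffffffe is -2 as a signed 64-bit number, and jl jumps
       over the call to explode_bomb when value < -2. */
    return value < -2;
}
```

For example, would_defuse("-3") is 1, while would_defuse("-2") and would_defuse("hello") are both 0 (strtol returns 0 when it finds no digits to convert).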
Also, we can extract the symbol table:
objdump -t bomb > bomb.tbl
The symbol table is sometimes
useful to identify calls to standard library functions,
such as printf, etc.
Note that the symbol table is present in the executable
even if it was compiled without the -g switch, unless it has been
explicitly removed (for example with the 'strip' utility).
Also note that using the symbol table is not directly necessary for Lab 2.
GDB (GNU DeBugger)
Now all we need to do is completely understand the assembly code,
and we can defuse the bomb.
In Lab 2, we will be dealing with a lot of code,
which can be difficult to understand.
Even if we do a good job, we might make a mistake and accidentally
detonate the bomb.
This is where gdb comes in.
It lets us step through the assembly code as it runs,
and examine the contents of registers and memory.
We can also set breakpoints at arbitrary positions in the program.
Breakpoints are points in the code where program execution
is instructed to stop. This way, we can let the debugger
run without interruption over large portions of code,
such as code that we already understand or believe is error-free.
Starting gdb
Start gdb by specifying what executable to debug:
'gdb bomb'
We can run the bomb in the debugger just as we would outside the debugger,
except that we can instruct the program to stop at certain locations
and inspect current values of memory and registers.
As a last resort, we can use (Ctrl-C) to stop the program
and panic out. But this is not recommended and is usually not
necessary, as long as we positioned our breakpoints appropriately.
To start a program inside gdb:
(gdb) run
To start a program inside gdb, with certain input parameters:
(gdb) run parameters
Examples:
(gdb) run < solution.txt
(equivalent to ./bomb < solution.txt , just this time inside gdb)
(gdb) run -d 1
(equivalent to ./bomb -d 1; this is a made-up example in
the specific case of the bomb program,
as 'bomb' supports no such parameters; this example is meant
to demonstrate how things would work in general)
Exiting gdb
To exit gdb and return to the shell prompt:
(gdb) quit
Note that exiting gdb means you lose all of your breakpoints that
you set in this gdb session. When you re-run gdb, you need
to respecify any breakpoints that you want to re-use.
A common mistake is to forget this and
then let the debugging proceed straight into the explode_bomb()
routine.
Breakpoints
We wouldn't be using gdb if all we did was run the program
without any interruptions.
We need to stop program execution at certain key positions in the code,
and then examine program behavior around those positions.
How do we pick a good location for a breakpoint?
First, we can always set a breakpoint
at 'main', since every C program has a function called 'main'.
In Lab 2, Dr. Evil accidentally gave us 'bomb.c'. By examining this
code, we see that we can place a good breakpoint at 'phase_1',
as this is where our input is examined (examine bomb.c).
(gdb) break phase_1
Note: if you mistype the name of the routine, gdb will print a warning and
not set any breakpoints.
Also note that program execution will always stop just BEFORE executing
the instruction you set the breakpoint on.
Another essential breakpoint to set is on the explode_bomb routine:
(gdb) break explode_bomb
For inputs that don't solve the puzzle, this breakpoint will be
your last safeguard before explosion. I recommend ALWAYS setting this
breakpoint. In addition, I recommend setting another
breakpoint inside explode_bomb, positioned after the call
to the routine that prints "BOOM!", but before the call to the routine
that notifies the server of the explosion.
This can be useful if you accidentally enter
explode_bomb, but don't notice that you hit the safeguard breakpoint.
After several hours of debugging, when concentration drops,
it can happen that you
accidentally instruct the program to keep going.
The second breakpoint will save you.
To set a breakpoint at the machine instruction located at the address 0x401A23:
(gdb) break *0x401A23
Note: don't forget the '0x'. If you forget it, and you are unlucky
enough that the address doesn't contain any of the characters A,B,C,D,E,F,
the breakpoint address will be interpreted as a
decimal number. This results in a completely different address
from the one desired, and the breakpoint won't work as expected.
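To see how far off the address can end up, the following sketch parses the same digits with and without the '0x' prefix. It uses the C library's strtoul with base 0, which follows the same convention for this case (a '0x' prefix means hex, no prefix means decimal). It only mimics the effect and is not a description of how gdb parses addresses internally. The address 0x400635 is taken from the example listing, and parse_address is a made-up name.

```c
#include <stdlib.h>

/* Hypothetical helper: parse an address the way a C-style numeric
   literal is read. With base 0, a "0x" prefix selects hex, a leading
   "0" selects octal, and anything else is read as decimal. */
unsigned long parse_address(const char *text)
{
    return strtoul(text, NULL, 0);
}
```

Here parse_address("0x400635") yields 0x400635, whereas parse_address("400635") yields decimal 400635, which is 0x61cfb and lies nowhere near the intended instruction.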
To see what breakpoints are currently set:
(gdb) info break
To delete one or more breakpoints:
(gdb) delete <breakpoint number>
Example:
(gdb) delete 4 7
erases breakpoints 4 and 7.
Terminating program execution from within gdb
We can terminate the program at any time:
(gdb) kill
Note that this doesn't exit gdb, and all your breakpoints
remain active. You can re-run the program using the run
command, and all breakpoints still apply.
Stepping through the code
To execute a single machine instruction, use:
(gdb) stepi
Note that if you use 'stepi' on a callq instruction, the debugger
will proceed into the called function.
Also note that pressing <ENTER> re-executes the last
gdb command. To execute several 'stepi' instructions
one after another, type 'stepi' once, and then press <ENTER>
several times in a row.
Sometimes we want to execute a single machine instruction,
but if that instruction is a call to a function, we want
the debugger to execute the function without our intervention.
This is achieved using 'nexti':
(gdb) nexti
The program will stop as soon as control returns from the function,
i.e. at the instruction
immediately after the callq in the caller function.
If you accidentally use stepi to enter a function call, and you
really don't want to debug that function, you can use 'finish'
to resume execution until the current function returns.
Execution will stop at the machine instruction immediately
after the 'callq' instruction in the caller function, just as
if we had called 'nexti' in the first place:
(gdb) finish
Note: make sure the current function can really be run
safely without your intervention. You don't want it
to call explode_bomb.
To instruct the program to execute (without your intervention)
until the next breakpoint is hit, use:
(gdb) continue
The same warning as in the case of 'finish' applies.
If the program contains debugging information (-g switch to gcc;
not the case in Lab 2, but otherwise usually the case),
we can also step a single C statement:
(gdb) step
Or, if the next statement contains a function call, we can use 'next' to
execute the function without our intervention. This is just like
nexti, except that it operates with C code as opposed to machine
instructions:
(gdb) next
Disassembling code using gdb
You can use 'disassemble' to disassemble a function or
a specified address range.
To disassemble function explode_bomb:
(gdb) disassemble explode_bomb
To disassemble the address range from 0x4005dc to 0x4005eb:
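(gdb) disassemble 0x4005dc 0x4005eb
Note: newer versions of gdb expect a comma between the two addresses,
i.e. 'disassemble 0x4005dc,0x4005eb'.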
Examining registers
To inspect the current values of registers:
(gdb) info registers
This prints out the current values of all registers.
To inspect the current value of a specific register:
(gdb) p $rax
To print the value in hex notation:
(gdb) p/x $rax
Note: using 'p $eax' to print just the lower 32 bits
of the register doesn't work (at least with
the current version of gdb on the fish machines).
You have to print a full 64-bit register.
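If only the lower 32 bits are of interest, they can be recovered from the full 64-bit value by masking. The C sketch below shows the arithmetic; low32 and low32_signed are made-up names. Since gdb evaluates C-like expressions, the same mask can probably be typed at the prompt as 'p/x $rax & 0xffffffff', but this has not been verified on the fish machines.

```c
#include <stdint.h>

/* Lower 32 bits of a 64-bit register value, i.e. what %eax holds when
   the full register %rax holds the value rax. */
uint32_t low32(uint64_t rax)
{
    return (uint32_t)(rax & 0xffffffffu);
}

/* The same 32 bits read as a signed two's-complement number. The
   conversion is written out by hand so that it does not depend on
   implementation-defined behavior of unsigned-to-signed casts. */
int32_t low32_signed(uint64_t rax)
{
    uint32_t v = low32(rax);
    if (v < 0x80000000u)
        return (int32_t)v;
    return (int32_t)(v - 0x80000000u) + INT32_MIN;
}
```

For the constant used in the example bomb, low32(0xfffffffffffffffe) is 0xfffffffe and low32_signed(0xfffffffffffffffe) is -2.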
To see the address of the next machine instruction to
be executed:
(gdb) frame
or, equivalently, you can inspect the instruction pointer register:
(gdb) p/x $rip
Normally, when debugging a C/C++ program for which the source code
is available (not the case with Lab 2), you can also inspect
the call-stack (a list of all nested function calls that led to
the current function being executed):
(gdb) where
Examining memory
To inspect the value of memory at location 0x400746:
(gdb) x/NFU 0x400746
Here:
N = number of units to display
F = output format (hex=x, signed decimal=d, unsigned decimal=u, string=s, char=c)
U = defines what constitutes a unit: b=1 byte, h=2 bytes, w=4 bytes, g=8 bytes
Note that the format characters and the unit characters are distinct from each other
(for example, x is the hex format, whereas h is the 2-byte unit).
Examples:
To use hex notation, and print two consecutive 64-bit
words, starting from the address 0x400746 and higher:
(gdb) x/2xg 0x400746
To print a null-terminated string at location 0x400746:
(gdb) x/s 0x400746
To use hex notation, and print five consecutive 32-bit
words, starting from the address 0x400746:
(gdb) x/5xw 0x400746
To print a single 32-bit word, in decimal notation,
at the address 0x400746:
(gdb) x/1dw 0x400746
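The following C sketch models what such x commands would show for the example bomb. The byte layout is an inference, not a dump of the real binary: the listing passes 0x40073c and 0x400749 to puts and 0x400746 to fscanf, and the string lengths in the source fit those addresses exactly if the three strings are stored back to back. The names RODATA_BASE, rodata, string_at and unit_at are made up for illustration. x86-64 is little-endian, so a multi-byte unit is assembled with the byte at the lowest address as the least significant byte.

```c
#include <stddef.h>
#include <stdint.h>

#define RODATA_BASE 0x40073cUL

/* Assumed contents of memory starting at 0x40073c. Adjacent string
   literals are concatenated; each "\0" ends one string, and the
   compiler appends the final terminating NUL. */
static const char rodata[] =
    "KABOOM!!!\0"                /* 0x40073c, 10 bytes including NUL */
    "%s\0"                       /* 0x400746,  3 bytes including NUL */
    "Bomb was safely defused.";  /* 0x400749 */

/* Like x/s addr: the NUL-terminated string starting at addr.
   No bounds checking; addr must lie inside the modeled region. */
const char *string_at(unsigned long addr)
{
    return rodata + (addr - RODATA_BASE);
}

/* Like x/1xb, x/1xh, x/1xw or x/1xg addr (size = 1, 2, 4 or 8):
   one unit of `size` bytes, read in little-endian order. */
uint64_t unit_at(unsigned long addr, size_t size)
{
    const unsigned char *p =
        (const unsigned char *)rodata + (addr - RODATA_BASE);
    uint64_t value = 0;
    for (size_t i = 0; i < size; i++)
        value |= (uint64_t)p[i] << (8 * i);
    return value;
}
```

Under this assumed layout, string_at(0x400746) is "%s", unit_at(0x400746, 1) is 0x25 (the character '%'), and unit_at(0x400746, 4) is 0x42007325, built from the bytes 25 73 00 42.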
The source code for the example bomb:
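#include <stdio.h>
#include <stdlib.h>

/* Prints the explosion message and terminates the program. */
void explode_bomb()
{
    printf("KABOOM!!!\n");
    exit(1);
}

int main()
{
    char s[80];

    /* Read one whitespace-delimited word from standard input. */
    fscanf(stdin, "%s", s);

    /* Explode unless the number entered is less than -2. */
    if (strtol(s, NULL, 10) >= -2)
        explode_bomb();

    printf("Bomb was safely defused.\n");
    return 0;
}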
Refer to the gdb notes online for a quick reference:
http://csapp.cs.cmu.edu/public/docs/gdbnotes.txt