15-213 (Fall 2006) - Recitation #2
Tudor Dumitraş
Adapted from Jernej Barbic's recitation (Spring 2006)
Adapted from Kun Gao's recitation (Spring 2005)
Let's go Steelers!!!
Lab 2:
The bomb is a 64-bit Unix binary executable.
You have to use the fish machines for this lab.
No source code is given, with the exception of the main() routine.
Each person has a different bomb.
Each time you make a mistake and the bomb explodes, you lose 1/2 point (up to a maximum of 20 points).
Compiling an example bomb (just for the recitation; different from the Lab 2 problems)
I generated a sample bomb program to illustrate the main
points of Lab 2. The program was compiled using:
gcc -O1 bomb.c -o bomb
Since the -g switch is not present, the binary contains no debugging information.
General note on compiling for debugging:
Normally, to enable the debugger to use the source code,
you would compile a program using:
gcc -g bomb.c -o bomb (for lowest level of optimization), or
gcc -g -O2 bomb.c -o bomb (for level 2 of optimization)
The -g -O2 combination is valid and lets you
debug the optimized executable.
However, the compiler will have applied
many optimizations, which in my experience make
it more difficult to step through the code. Using -g with
no optimizations works best for debugging with source code.
Debugging with source code is not the debugging style of Lab 2 -
we will work with the assembly code directly.
Examining the bomb
The symbol table is sometimes
useful to identify calls to standard library functions,
(e.g., printf), as well as the bomb's own functions.
Note that the symbol table is present in the executable
even if it was compiled without the -g switch
(unless it has been removed with strip).
You can look at the bomb's entire symbol table by using nm:
nm bomb
Examine the symbols marked with a T (capital t), and
ignore the ones that start with an _ (underscore). The remaining
ones are the names of functions defined in the C program from which
the bomb was compiled.
Notice that there is a function called explode_bomb; can you guess
what this function does?
Next, take a look at the printable strings from the file:
strings bomb
This way, you may find clues that will help you defuse some of the
phases of your bomb.
Then, use objdump to disassemble the bomb:
objdump -d bomb | less
Assembly code for our example bomb:
...
0000000000400588 <explode_bomb>:
400588: 48 83 ec 08 sub $0x8,%rsp
40058c: bf 2c 07 40 00 mov $0x40072c,%edi
400591: e8 22 ff ff ff callq 4004b8
400596: bf 01 00 00 00 mov $0x1,%edi
40059b: e8 08 ff ff ff callq 4004a8
00000000004005a0 <phase_1_of_1>:
4005a0: 53 push %rbx
4005a1: 48 83 ec 10 sub $0x10,%rsp
4005a5: bb 01 00 00 00 mov $0x1,%ebx
4005aa: 48 8d 4c 24 0c lea 0xc(%rsp),%rcx
4005af: 48 8d 54 24 08 lea 0x8(%rsp),%rdx
4005b4: be 36 07 40 00 mov $0x400736,%esi
4005b9: 48 8b 3d 30 05 10 00 mov 1049904(%rip),%rdi # 500af0 <__bss_start>
4005c0: b8 00 00 00 00 mov $0x0,%eax
4005c5: e8 ce fe ff ff callq 400498
4005ca: 83 f8 02 cmp $0x2,%eax
4005cd: 74 0a je 4005d9
4005cf: b8 00 00 00 00 mov $0x0,%eax
4005d4: e8 af ff ff ff callq 400588
4005d9: b8 01 00 00 00 mov $0x1,%eax
4005de: 3b 44 24 08 cmp 0x8(%rsp),%eax
4005e2: 7d 0d jge 4005f1
4005e4: 8b 54 24 08 mov 0x8(%rsp),%edx
4005e8: 0f af d8 imul %eax,%ebx
4005eb: ff c0 inc %eax
4005ed: 39 d0 cmp %edx,%eax
4005ef: 7c f7 jl 4005e8
4005f1: 39 5c 24 0c cmp %ebx,0xc(%rsp)
4005f5: 74 0a je 400601
4005f7: b8 00 00 00 00 mov $0x0,%eax
4005fc: e8 87 ff ff ff callq 400588
400601: 48 83 c4 10 add $0x10,%rsp
400605: 5b pop %rbx
400606: c3 retq
0000000000400607 <main>:
400607: 48 83 ec 08 sub $0x8,%rsp
40060b: bf 48 07 40 00 mov $0x400748,%edi
400610: e8 a3 fe ff ff callq 4004b8
400615: bf 3c 07 40 00 mov $0x40073c,%edi
40061a: e8 99 fe ff ff callq 4004b8
40061f: b8 00 00 00 00 mov $0x0,%eax
400624: e8 77 ff ff ff callq 4005a0
400629: bf a8 07 40 00 mov $0x4007a8,%edi
40062e: e8 85 fe ff ff callq 4004b8
400633: b8 00 00 00 00 mov $0x0,%eax
400638: 48 83 c4 08 add $0x8,%rsp
40063c: c3 retq
40063d: 90 nop
40063e: 90 nop
40063f: 90 nop
...
Look at the code of explode_bomb; try to figure out what it does.
Running the bomb
The bomb can be invoked by:
./bomb
The program waits for you to enter a string.
You can enter the input from the keyboard, or read it in from a file:
./bomb < solution.txt
The bomb then examines the string, and either explodes, or not.
Lab 2 problem statement
What input string should we give
to the program so that the bomb doesn't explode, assuming we don't
have access to the program's source code?
GDB (GNU DeBugger)
Now all we need to do is completely understand the assembly code,
and we can defuse the bomb.
In Lab 2, we will be dealing with a lot of code,
which can be difficult to understand.
Even if we do a good job, we might make a mistake and accidentally
detonate the bomb.
This is where gdb comes in.
It lets us step through the assembly code as it runs,
and examine the contents of registers and memory.
We can also set breakpoints at arbitrary positions in the program.
Breakpoints are points in the code where program execution
is instructed to stop. This way, we can let the debugger
run without interruption over large portions of code,
such as code that we already understand or believe is error-free.
Starting gdb
Start gdb by specifying what executable to debug:
gdb bomb
We can run the bomb in the debugger just as we would outside the debugger,
except that we can instruct the program to stop at certain locations
and inspect current values of memory and registers.
As a last resort, we can use (Ctrl-C) to stop the program
and panic out. But this is not recommended and is usually not
necessary, as long as we positioned our breakpoints appropriately.
To start a program inside gdb:
(gdb) run
To start a program inside gdb, with certain input parameters:
(gdb) run parameters
Examples:
(gdb) run < solution.txt
(equivalent to ./bomb < solution.txt , just this time inside gdb)
(gdb) run -d 1
(equivalent to ./bomb -d 1; this is a made-up example in
the specific case of the bomb program,
as 'bomb' supports no such parameters; it is meant
to demonstrate how things would work in general)
Exiting gdb
To exit gdb and return to the shell prompt:
(gdb) quit
Note that exiting gdb means you lose all of the breakpoints that
you set in this gdb session. When you re-run gdb, you need
to respecify any breakpoints that you want to re-use.
A common mistake is to forget this and
then let the program run straight into the explode_bomb()
routine.
Breakpoints
We wouldn't be using gdb if all we did was run the program
without any interruptions.
We need to stop program execution at certain key positions in the code,
and then examine program behavior around those positions.
How do we pick a good location for a breakpoint?
First, we can always set a breakpoint
at 'main', since every C program has a function called 'main'.
In Lab 2, Dr. Evil accidentally gave us 'bomb.c'. By examining this
code, we see that 'phase_1' is a good place for a breakpoint,
as this is where our input is examined (see bomb.c).
(gdb) break phase_1
Note: if you mistype the name of the routine, gdb will print a warning and
not set any breakpoints.
Also note that program execution will always stop just BEFORE executing
the instruction you set the breakpoint on.
Another essential breakpoint to set is on the explode_bomb routine:
(gdb) break explode_bomb
For inputs that don't solve the puzzle, this breakpoint will be
your last safeguard before the explosion. I recommend ALWAYS setting this
breakpoint. In addition, I recommend setting another
breakpoint inside explode_bomb, positioned after the call
to the routine that prints "BOOM!", but before the call to the routine
that notifies the server of the explosion.
This can be useful if you accidentally enter
explode_bomb but don't notice that you hit the safeguard breakpoint.
After several hours of debugging, when concentration drops,
it can happen that you
accidentally instruct the program to keep going.
The second breakpoint will save you.
To set a breakpoint at the machine instruction located at the address 0x401A23:
(gdb) break *0x401A23
Note: don't forget the '0x'. If you forget it, and you are unlucky
enough that the address doesn't contain any of the digits A-F,
the breakpoint address will be interpreted as a
decimal number. This results in a completely different address
from the one desired, and the breakpoint won't work as expected.
To see what breakpoints are currently set:
(gdb) info break
To delete one or more breakpoints:
(gdb) delete <breakpoint number>
Example:
(gdb) delete 4 7
erases breakpoints 4 and 7.
Terminating program execution from within gdb
We can terminate the program at any time:
(gdb) kill
Note that this doesn't exit gdb, and all your breakpoints
remain active. You can re-run the program using the run
command, and all breakpoints still apply.
Stepping through the code
To execute a single machine instruction, use:
(gdb) stepi
Note that if you use 'stepi' on a callq instruction, the debugger
will proceed into the called function.
Also note that pressing <ENTER> re-executes the last
gdb command. To execute several 'stepi' instructions
one after another, type 'stepi' once, and then press <ENTER>
several times in a row.
Sometimes we want to execute a single machine instruction,
but if that instruction is a call to a function, we want
the debugger to execute the function without our intervention.
This is achieved using 'nexti':
(gdb) nexti
The program will stop as soon as control returns from the function,
i.e., at the instruction
immediately after the callq in the caller function.
If you accidentally use stepi to enter a function call, and you
really don't want to debug that function, you can use 'finish'
to resume execution until the current function returns.
Execution will stop at the machine instruction immediately
after the 'callq' instruction in the caller function, just as
if we had used 'nexti' in the first place:
(gdb) finish
Note: make sure the current function can really be run
safely without your intervention. You don't want it
to call explode_bomb.
To instruct the program to execute (without your intervention)
until the next breakpoint is hit, use:
(gdb) continue
The same warning as in the case of 'finish' applies.
If the program contains debugging information (-g switch to gcc;
not the case in Lab 2, but otherwise usually the case),
we can also step a single C statement:
(gdb) step
Or, if the next statement is a function call, we can use 'next' to
execute the function without our intervention. This is just like
nexti, except that it operates on C statements as opposed to machine
instructions:
(gdb) next
Disassembling code using gdb
You can use 'disassemble' to disassemble a function or
a specified address range.
To disassemble function explode_bomb:
(gdb) disassemble explode_bomb
To disassemble the address range from 0x4005dc to 0x4005eb
(newer versions of gdb require a comma between the two addresses):
(gdb) disassemble 0x4005dc 0x4005eb
Examining registers
To inspect the current values of registers:
(gdb) info registers
This prints out the current values of all registers.
To inspect the current values of a specific register:
(gdb) p $rax
To print the value in hex notation:
(gdb) p/x $rax
Note: using 'p $eax' to print just the lower 32 bits
of the register doesn't work (at least with
the current version of gdb on the fish machines).
You have to print a full 64-bit register.
To see the address of the next machine instruction to
be executed:
(gdb) frame
or, equivalently, you can inspect the instruction pointer register:
(gdb) p/x $rip
Normally, when debugging a C/C++ program for which the source code
is available (not the case with Lab 2), you can also inspect
the call-stack (a list of all nested function calls that led to
the current function being executed):
(gdb) where
Examining memory
To inspect the value of memory at location 0x400746:
(gdb) x/NFU 0x400746
Here:
N = number of units to display
F = output format (hex=x, signed decimal=d, unsigned decimal=u, string=s, char=c)
U = defines what constitutes a unit: b=1 byte, h=2 bytes, w=4 bytes, g=8 bytes
Note that the format characters and the unit characters are distinct from each other,
so they can be given in either order.
Examples:
To use hex notation, and print two consecutive 64-bit
words, starting from the address 0x400746 and higher:
(gdb) x/2xg 0x400746
To print a null-terminated string at location 0x400746:
(gdb) x/s 0x400746
To use hex notation, and print five consecutive 32-bit
words, starting from the address 0x400746:
(gdb) x/5xw 0x400746
To print a single 32-bit word, in decimal notation,
at the address 0x400746:
(gdb) x/1dw 0x400746
The source code for the example bomb:
#include <stdio.h>
#include <stdlib.h>

void explode_bomb() {
    printf("KABOOM!!!\n");
    exit(1);
}

void phase_1_of_1 () {
    int args, num, fact;
    int i = 0;
    int check_fact = 1;
    args = fscanf (stdin, "%d %d", &num, &fact);
    if (args != 2)
        explode_bomb();
    for (i = 1; i < num; i++)
        check_fact = check_fact * i;
    if (fact != check_fact)
        explode_bomb();
}

int main() {
    printf("Welcome to the demo bomb. In another moment of weakness, Dr. Evil created this demo bomb.\n");
    printf ("Phase 1\n");
    phase_1_of_1 ();
    printf("You safely defused the bomb. Well done.\n");
    return 0;
}
Refer to the gdb notes online for a quick reference:
http://www.cs.cmu.edu/afs/cs/academic/class/15213-f06/www/docs/GDB_commands.txt