Lecture 2 Reverse Engineering
Reverse Engneering: Why?
Reversing: Pulling apart existing software, figuring out how it works, and then breaking it
- Security Testing: to verify something fits the specification, or find vulns.
- Malware: Work out how it works and stop it
- Can patch and remove jumps etc.
- Some crypto can be easily bypassed
Compilation
- Source code > Compiler > ASM > Assembler > Machine Code > Linker > Program
- Program > Loader > Process
x86 Architecture
Syntax:
Use Intel (AT&T)
echo "set disassembly-flavor intel" >> ~/.gdbinit
Registers

‘E’ stands for ‘extended’.
Register Conventions
EBX: “Base index” for arraysECX: “Counter”EIP: Instruction PointerEAX,EBX,ECX,EDX,ESI,EDI: General PurposeEAX: is frequently used as the first argument of a funtion call, and as return valueESP: “Stack Pointer”EBP: “Base Pointer”ESI: “Source”EDI: “Destination
A Basic Function Call

void drawSquare (args) {
// stuff
drawLine(more, args);
// more stuff
}
drawSquare();
Assembly Process for Calling Functions
PUSH EBP,
MOV EBP, ESP
# reserve space for arguments
ADD ESP, 8 # e.g. 8 byes for 2x4byte numbers
Instructions
| Instruction | Description |
|---|---|
NOP |
0x90: No Operation |
PUSH eax |
pushes value in eax onto top of stack |
POP eax |
takes value on top of stack and pops it into eax |
MOV |
Moves data. [] mean dereference address in register |
ADD |
Adds second argument to the first argument |
SUB |
Subtract |
LEA |
Load Effective Address: load the address of the source into destination |
JMP |
Essentially set EIP to a certain address. Jumps are relative unless you specify a register |
CALL |
Can be relative or non-relative. Essentially performs PUSH eip then JMP addr |
RET |
Is the opposite of CALL: it is essentially “POP eip”. Can take an argument (which means cleans up n bytes on the stack, then pops eip - rarely use). |
INT, SYSENTER, or SYSCALL |
Perform syscall |
REP |
Repeat instruction. Takes argument to which instruction you want to repeat |
AND, OR etc. |
Operations |
TEST |
same as an AND: doesn’t save values, sets 0 flag |
CMP |
same as SUB: doesn’t save values, sets 0 flag |
| Conditional Jumps | Based on some flag: e.g. jump if zero, jump if equal |
Calling Conventions
fastcall
- Uses registers to pass arguments
- Common in windows
stdcall
- Pushes the arguments onto the stack
- Callee is responsible for cleaning up the stack
cdecl
- Pushes argument onto the stack
- Caller is responsible for cleaning up the stack
- This is what linux uses
- Return value is normally in
eax
Stack Protections
checksec is a pwntools command to check what protections are in place on a binary.
Stack Canary/Cookie
- Random number before protected area on stack (such as return address)
- If the random number has changed, the program will crash instead of return
- Bypass
- Find out the canary value and just write it in the exploit so it still apperas the same
- Format of Stack Canaries:
- On a 32-bit machine, it’s a 4byte value with a null termination: this makes it hard to exploit via string copy functions since strings will all end in a null
- Randomised on Program Start: but every function will have the same canary during execution
GS:0x14is accessing a struct containinguintptr_t stack_guard;, which is the canary
Variable Reordering
- During compilation, varibales are reordered to make buffers at the bottom of the stack, with other variables on top (so other vars can’t be overwritten)
ASLR
- Still to come…
__x86.get_pc_thunk.bx:
- Get program counter
- Way of findin th ebase of the binary
- “Thunk”: subroutine used to inject additional calculation into another subroutine.
NX: Non-executable Stack
- Should never have a region of memory that is writeable AND executable
- Can be disabled and isn’t used in some places
- e.g. all browsers need to be able to write and then execute: case in point: javascript
Reverse Engineering Techniques
Structures & Arrays
Array
mov eax, [esi+edi*4+18h] ; accessing an entry of the array esi at index edi. The 18h is because this example is an array inside of a struct (0x18 is start of array in struct)
struct mystruct {
int age;
char * name;
struct mystruct* next;
int uid;
unt gid;
int array[100];
}
So the above assembly is the same as this:
struct.array[edi];
Recognising Data Structures If a function is already reversed, can use this knowledge to work out data structuers of caller functions, since they will have to pass in arguments of the correct type.
Switch Statements
- Forms an array of addresses ‘jump table’
- If the cases do not start at 0, then there is a subtraction so that the same jump table/offsets can be used
- If it is not an integer, then it will use a less/grrater than tree
Recommended Reading
- Smashing the STack for Fun and Profit (Phrack)