Justin Spencer: Guide to x86 Assembly by University of Virginia

Guide to x86 Assembly

MASM (Microsoft Macro Assembler) uses the standard Intel syntax for writing x86 assembly code.
The full x86 instruction set is large and complex.
Modern x86 processors have eight 32-bit general-purpose registers. The register names are mostly historical.
Whereas most of the registers have lost their special purposes in the modern instruction set, by convention, two are reserved for special purposes--the stack pointer (ESP) and the base pointer (EBP).
You can declare static data regions (analogous to global variables) in x86 assembly using special assembler directives for this purpose. Data declarations should be preceded by the .DATA directive. Following this directive, the directives DB, DW, and DD can be used to declare one-, two-, and four-byte data locations, respectively. Declared locations can be labeled with names for later reference--this is similar to declaring variables by name, but abides by some lower-level rules.
Unlike in high-level languages where arrays can have many dimensions and are accessed by indices, arrays in x86 assembly language are simply a number of cells located contiguously in memory.
The DUP directive tells the assembler to duplicate an expression a given number of times.
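As a rough illustration of what these directives lay out in memory, the Python sketch below models a few declarations as little-endian byte strings. The helper names (db, dw, dd, dup) and the MASM lines shown in the comments are assumptions made for this example, not text from the guide.

```python
import struct

def db(*values):
    """Model DB: one byte per value."""
    return b"".join(struct.pack("<B", v) for v in values)

def dw(*values):
    """Model DW: two bytes per value, little-endian."""
    return b"".join(struct.pack("<H", v) for v in values)

def dd(*values):
    """Model DD: four bytes per value, little-endian."""
    return b"".join(struct.pack("<I", v) for v in values)

def dup(count, chunk):
    """Model DUP: repeat an already-encoded expression `count` times."""
    return chunk * count

# Illustrative MASM equivalent:   var  DB 64
var = db(64)
# Illustrative MASM equivalent:   z    DD 1, 2, 3
z = dd(1, 2, 3)
# Illustrative MASM equivalent:   arr  DD 100 DUP(0)
arr = dup(100, dd(0))

sizes = (len(var), len(z), len(arr))  # sizes in bytes of each declared region
```

In this model an "array" is nothing more than the concatenated bytes of its cells, which matches the description of arrays as contiguous memory.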
Modern x86-compatible processors are capable of addressing up to 2^32 bytes of memory: memory addresses are 32 bits wide.
In general, the intended size of the data item at a given memory address can be inferred from the assembly code instruction in which it is referenced.
Machine instructions generally fall into three categories: data movement, arithmetic/logic, and control-flow.
The mov instruction copies the data item referred to by its second operand into the location referred to by its first operand.
The push instruction places its operand onto the top of the hardware-supported stack in memory. Specifically, push first decrements ESP by 4, then places its operand into the contents of the 32-bit location at address [ESP].
ESP (the stack pointer) is decremented by push since the x86 stack grows down--i.e. the stack grows from high addresses to lower addresses.
The pop instruction removes the 4-byte data element from the top of the hardware-supported stack into the specified operand. It first moves the 4 bytes located at memory location [ESP] into the specified register or memory location, and then increments ESP by 4.
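The order of operations in push and pop can be sketched with a toy model in Python. The class name Stack32 and the 64-byte memory size are assumptions for this example; a real stack lives in the process's address space, not in a small byte array.

```python
class Stack32:
    """Toy model of the x86 hardware stack: a byte array plus an ESP value."""

    def __init__(self, size=64):
        self.mem = bytearray(size)
        self.esp = size  # empty stack: ESP sits just past the highest address

    def push(self, value):
        self.esp -= 4  # decrement ESP first (the stack grows down)
        self.mem[self.esp:self.esp + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")

    def pop(self):
        value = int.from_bytes(self.mem[self.esp:self.esp + 4], "little")
        self.esp += 4  # increment ESP after reading the 4 bytes at [ESP]
        return value

s = Stack32()
s.push(0x11111111)
s.push(0x22222222)
top = s.pop()  # the most recently pushed value comes off first
```

Note that pop does not erase the old bytes; it only moves ESP, so the data below the new top of stack is simply treated as free space.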
The lea instruction places the address specified by its second operand into the register specified by its first operand. Note, the contents of the memory location are not loaded, only the effective address is computed and placed into the register.
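The difference between computing an address and loading from it can be sketched in Python. The function names, the register dictionary, and the sample address expression (base + index*scale) are assumptions chosen for illustration; the instruction written in the comment is a hypothetical example, not one taken from the guide.

```python
def effective_address(base=0, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF

def lea(regs, dst, **addr):
    """Model lea: store the computed address itself; memory is never read."""
    regs[dst] = effective_address(**addr)

def mov_load(regs, mem, dst, **addr):
    """Model a mov from memory: read the 4 bytes stored at the computed address."""
    a = effective_address(**addr)
    regs[dst] = int.from_bytes(mem[a:a + 4], "little")

mem = bytearray(64)
mem[16:20] = (0xDEADBEEF).to_bytes(4, "little")
regs = {"eax": 0, "ecx": 0, "ebx": 8, "esi": 2}

# Hypothetical: lea eax, [ebx+4*esi]  -> eax receives the address 8 + 4*2
lea(regs, "eax", base=regs["ebx"], index=regs["esi"], scale=4)
# Hypothetical: mov ecx, [ebx+4*esi]  -> ecx receives the value stored there
mov_load(regs, mem, "ecx", base=regs["ebx"], index=regs["esi"], scale=4)
```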
The add instruction adds together its two operands, storing the result in its first operand.
The sub instruction subtracts the value of its second operand from the value of its first operand and stores the result in its first operand.
The inc instruction increments the contents of its operand by one.
The dec instruction decrements the contents of its operand by one.
These instructions [and, or, xor] perform the specified logical operation on their operands, placing the result in the first operand location.
not logically negates the operand contents (that is, flips all bit values in the operand).
neg performs the two's complement negation of the operand contents.
These instructions [shl, shr] shift the bits in their first operand's contents left and right, padding the resulting empty bit positions with zeros.
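Because Python integers are unbounded, the 32-bit behavior of not, neg, shl, and shr has to be modeled with an explicit mask. The sketch below is one way to do that; the function names are made up for this example, and the reduction of the shift count modulo 32 reflects how x86 treats shift counts on 32-bit operands.

```python
MASK32 = 0xFFFFFFFF

def not32(x):
    """Flip all 32 bits."""
    return ~x & MASK32

def neg32(x):
    """Two's complement negation within 32 bits."""
    return -x & MASK32

def shl32(x, n):
    """Shift left, filling with zeros; bits shifted past bit 31 are lost."""
    return (x << (n & 31)) & MASK32

def shr32(x, n):
    """Logical shift right, filling with zeros from the top."""
    return (x & MASK32) >> (n & 31)

x = 0x80000001
results = (not32(0), neg32(1), shl32(x, 1), shr32(x, 1))
```

The model also shows the usual identity between the two negations: neg32(v) equals not32(v) plus one, wrapped to 32 bits.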
The x86 processor maintains an instruction pointer (EIP) register that is a 32-bit value indicating the location in memory where the current instruction starts. Normally, it increments to point to the next instruction in memory after execution of an instruction. The EIP register cannot be manipulated directly, but is updated implicitly by the control flow instructions.
We use the notation <label> to refer to labeled locations in the program text. Labels can be inserted anywhere in x86 assembly code text by entering a label name followed by a colon.
jmp transfers program control flow to the instruction at the memory location indicated by the operand.
These instructions [conditional jumps] are conditional jumps that are based on the status of a set of condition codes that are stored in a special register called the machine status word. The contents of the machine status word include information about the last arithmetic operation performed.
cmp compares the values of the two specified operands, setting the condition codes in the machine status word appropriately. This instruction is equivalent to the sub instruction, except the result of the subtraction is discarded instead of replacing the first operand.
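How a cmp followed by a conditional jump reaches its decision can be sketched in Python. The flag names (ZF, SF, CF, OF) and the jump conditions used here follow the standard x86 definitions, but the function names and the dictionary representation are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a signed two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def cmp32(a, b):
    """Return the flags a 32-bit `cmp a, b` would set, based on a - b."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32          # computed, then discarded by cmp
    signed = to_signed(a) - to_signed(b)
    return {
        "ZF": result == 0,                           # operands are equal
        "SF": bool(result & 0x80000000),             # sign bit of the result
        "CF": a < b,                                 # unsigned borrow
        "OF": not (-2**31 <= signed <= 2**31 - 1),   # signed overflow
    }

def je(f):
    return f["ZF"]

def jne(f):
    return not f["ZF"]

def jl(f):
    return f["SF"] != f["OF"]

def jg(f):
    return (not f["ZF"]) and f["SF"] == f["OF"]

flags = cmp32(0xFFFFFFFF, 1)  # as signed values: -1 compared with 1
taken = jl(flags)             # True: -1 is less than 1 in a signed comparison
```

The signed jumps consult both SF and OF so that the comparison stays correct even when the subtraction overflows, for example when comparing the largest positive value with the most negative one.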
These instructions [call, ret] implement a subroutine call and return.
The call instruction first pushes the current code location onto the hardware-supported stack in memory, and then performs an unconditional jump to the code location indicated by the label operand. Unlike the simple jump instructions, the call instruction saves the location to return to when the subroutine completes.
The ret instruction implements a subroutine return mechanism. This instruction first pops a code location off the hardware-supported in-memory stack. It then performs an unconditional jump to the retrieved code location.
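The interplay of call and ret can be sketched with a toy interpreter in Python. This is a simplification under stated assumptions: instruction "addresses" are list indices rather than byte addresses, the opcode names nop and halt exist only in this model, and the stack holds nothing but return addresses.

```python
def run(program, entry=0):
    """Execute a toy program and return the addresses in the order they ran."""
    stack, eip, trace = [], entry, []
    while True:
        op, arg = program[eip]
        trace.append(eip)
        if op == "call":
            stack.append(eip + 1)  # push the return address (the next instruction)
            eip = arg              # then jump to the subroutine
        elif op == "ret":
            eip = stack.pop()      # pop the saved address and jump to it
        elif op == "halt":
            return trace
        else:                      # "nop": fall through to the next instruction
            eip += 1

program = [
    ("call", 3),     # 0: call the subroutine that starts at address 3
    ("nop", None),   # 1: execution resumes here after ret
    ("halt", None),  # 2
    ("nop", None),   # 3: subroutine body
    ("ret", None),   # 4
]
trace = run(program)
```

The trace visits address 1 only after the subroutine has run, which is the behavior that distinguishes call from a plain jmp.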
To allow separate programmers to share code and develop libraries for use by many programs, and to simplify the use of subroutines in general, programmers typically adopt a common calling convention. The calling convention is a protocol about how to call and return from routines.
The C calling convention is based heavily on the use of the hardware-supported stack. It is based on the push, pop, call, and ret instructions. Subroutine parameters are passed on the stack. Registers are saved on the stack, and local variables used by subroutines are placed in memory on the stack. The vast majority of high-level procedural languages implemented on most processors have used similar calling conventions.
The calling convention is broken into two sets of rules. The first set of rules is employed by the caller of the subroutine, and the second set of rules is observed by the writer of the subroutine (the callee).
It should be emphasized that mistakes in the observance of these rules quickly result in fatal program errors since the stack will be left in an inconsistent state; thus meticulous care should be used when implementing the call convention in your own subroutines.
To make a subroutine call, the caller should:

Before calling a subroutine, the caller should save the contents of certain registers that are designated caller-saved. The caller-saved registers are EAX, ECX, EDX. Since the called subroutine is allowed to modify these registers, if the caller relies on their values after the subroutine returns, the caller must push the values in these registers onto the stack (so they can be restored after the subroutine returns).
To pass parameters to the subroutine, push them onto the stack before the call. The parameters should be pushed in inverted order (i.e. last parameter first). Since the stack grows down, the first parameters will be stored at the lowest address (this inversion of parameters was historically used to allow functions to be passed a variable number of parameters).
To call the subroutine, use the call instruction. This instruction places the return address on top of the parameters on the stack, and branches to the subroutine code. This invokes the subroutine, which should follow the callee rules below.

After the subroutine returns (immediately following the call instruction), the caller can expect to find the return value of the subroutine in the register EAX. To restore the machine state, the caller should:

Remove the parameters from the stack. This restores the stack to its state before the call was performed.
Restore the contents of caller-saved registers (EAX, ECX, EDX) by popping them off of the stack. The caller can assume that no other registers were modified by the subroutine.
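The caller's steps can be traced with a small Python model of the stack. Everything specific here is an assumption for illustration: the Stack32 class, the three sample parameters, the made-up return address 0x00401005, and the choice to save only ECX. The callee is not modeled beyond reading its parameters.

```python
class Stack32:
    """Toy downward-growing stack of 32-bit values."""

    def __init__(self, size=256):
        self.mem = bytearray(size)
        self.esp = size

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp:self.esp + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")

    def read32(self, addr):
        return int.from_bytes(self.mem[addr:addr + 4], "little")

    def pop(self):
        value = self.read32(self.esp)
        self.esp += 4
        return value

def caller(stack, args, saved_ecx=0xC0FFEE):
    esp_at_entry = stack.esp
    stack.push(saved_ecx)             # save a caller-saved register still needed
    for value in reversed(args):      # push parameters, last parameter first
        stack.push(value)
    stack.push(0x00401005)            # what call does: push a (made-up) return address
    # View from inside the callee: [ESP] is the return address,
    # [ESP+4] the first parameter, [ESP+8] the second, and so on.
    seen = [stack.read32(stack.esp + 4 * (i + 1)) for i in range(len(args))]
    stack.pop()                       # what ret does: pop the return address
    stack.esp += 4 * len(args)        # remove the parameters (like add esp, 4*n)
    restored_ecx = stack.pop()        # restore the saved register
    return seen, restored_ecx, stack.esp == esp_at_entry

seen, restored_ecx, balanced = caller(Stack32(), [10, 20, 30])
```

Because the last parameter is pushed first, the first parameter ends up at the lowest address, directly above the return address.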

The base pointer is used by convention as a point of reference for finding parameters and local variables on the stack. When a subroutine is executing, the base pointer holds a copy of the stack pointer value from when the subroutine started executing. Parameters and local variables will always be located at known, constant offsets away from the base pointer value.
We push the old base pointer value at the beginning of the subroutine so that we can later restore the appropriate base pointer value for the caller when the subroutine returns.
Allocate local variables by making space on the stack. Recall, the stack grows down, so to make space on the top of the stack, the stack pointer should be decremented. The amount by which the stack pointer is decremented depends on the number and size of local variables needed.
To save registers, push them onto the stack.
Note that the callee's rules fall cleanly into two halves that are basically mirror images of one another. The first half of the rules apply to the beginning of the function, and are commonly said to define the prologue to the function. The latter half of the rules apply to the end of the function, and are thus commonly said to define the epilogue of the function.
The subroutine prologue performs the standard actions of saving a snapshot of the stack pointer in EBP (the base pointer), allocating local variables by decrementing the stack pointer, and saving register values on the stack.
The function epilogue is basically a mirror image of the function prologue. The caller's register values are recovered from the stack, the local variables are deallocated by resetting the stack pointer, the caller's base pointer value is recovered, and the ret instruction is used to return to the appropriate code location in the caller.
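The mirror-image structure of prologue and epilogue can be sketched in Python. This is a toy model under stated assumptions: the Cpu class, the single saved register (EBX), the one local variable, and the doubling performed by the body are all invented for the example, and the final ret is left to the caller of the model.

```python
class Cpu:
    """Toy machine: a small memory plus the registers this example needs."""

    def __init__(self, size=256):
        self.mem = bytearray(size)
        self.reg = {"esp": size, "ebp": 0, "ebx": 0x0B0B0B0B}

    def read32(self, addr):
        return int.from_bytes(self.mem[addr:addr + 4], "little")

    def write32(self, addr, value):
        self.mem[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")

    def push(self, value):
        self.reg["esp"] -= 4
        self.write32(self.reg["esp"], value)

    def pop(self):
        value = self.read32(self.reg["esp"])
        self.reg["esp"] += 4
        return value

def callee(cpu, n_locals=1):
    r = cpu.reg
    # Prologue
    cpu.push(r["ebp"])                        # push ebp
    r["ebp"] = r["esp"]                       # mov  ebp, esp
    r["esp"] -= 4 * n_locals                  # sub  esp, 4*n_locals
    cpu.push(r["ebx"])                        # push ebx (the body clobbers it)
    # Body: parameters sit above EBP, locals below it
    first_param = cpu.read32(r["ebp"] + 8)    # [ebp+8] is the first parameter
    cpu.write32(r["ebp"] - 4, first_param * 2)  # [ebp-4] is the first local
    r["ebx"] = 0                              # overwrite the saved register
    result = cpu.read32(r["ebp"] - 4)         # value real code would leave in EAX
    # Epilogue: undo the prologue in reverse order
    r["ebx"] = cpu.pop()                      # pop ebx
    r["esp"] = r["ebp"]                       # mov esp, ebp (frees the locals)
    r["ebp"] = cpu.pop()                      # pop ebp
    return result

cpu = Cpu()
cpu.reg["ebp"] = 0xAAAA        # stand-in for the caller's base pointer
cpu.push(21)                   # the caller pushed one parameter
cpu.push(0x00401005)           # call pushed a (made-up) return address
before = dict(cpu.reg)
result = callee(cpu)
after = dict(cpu.reg)
```

After the epilogue, ESP points at the return address again and EBP and EBX hold the caller's values, so a ret at this point would return control with the caller's state intact.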


20170929
