Justin Spencer: Modern Binary Exploitation

Modern Binary Exploitation

Binary: An executable such as an EXE, ELF, Mach-O, or other code container that runs on a machine.
Malware: A malicious binary meant to persist on a machine such as a rootkit or remote access tool (RAT).
Vulnerability: A bug in a binary that can be leveraged by an exploit.
Exploit: Specially crafted data that utilizes vulnerabilities to force the binary into doing something unintended.
0-Day: A previously unknown or unpatched vulnerability that can be used by an exploit.
If your program simply segfaulted, consider yourself lucky.
The right bugs (vulnerabilities) found in binaries can be used by exploits to hijack code execution.
Compared to modern languages, C is considered a 'low level' language.
It's easy to make grievous errors in C.
All assembly languages are made up of instruction sets.
Instructions are generally simple arithmetic operations that take registers or constant values as arguments.
ESP - stack pointer, "top" of the current stack frame.
EBP - base pointer, "bottom" of the current stack frame.
EIP - instruction pointer, pointer to the next instruction to be executed by the CPU.
Cyber security is one of the fastest growing fields in computer science, though its study is rarely covered in academia due to its rapid pace of development and its technical specificity.
Memory corruption is typically at the heart of binary exploitation.
The vast majority of system-level exploits (real-world and competition) involve memory corruption.
Shellcode: A set of instructions that are injected by the user and executed by the exploited binary. Generally the "payload" of an exploit.
Using shellcode you can essentially make a program execute code that never existed in the original binary. You're basically injecting code.
Shellcode is generally hand coded in assembly, but its functionality can be represented in C.
When shellcode is read as a string, null bytes become an issue with common string functions. So, make your shellcode NULL free!
XOR-ing a value with itself clears it (the result is always zero).
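As a small illustration of the null-byte problem, the sketch below compares two standard x86 encodings that both zero EAX. The helper names `is_null_free` and `as_c_string` are made up for this example, and `as_c_string` is only a rough model of what a strcpy-style function would copy.

```python
# Two x86 encodings that both set EAX to zero.
MOV_EAX_0 = b"\xb8\x00\x00\x00\x00"   # mov eax, 0   (the immediate is four null bytes)
XOR_EAX_EAX = b"\x31\xc0"             # xor eax, eax (no null bytes)


def is_null_free(code: bytes) -> bool:
    """Return True if the byte string contains no 0x00 bytes."""
    return b"\x00" not in code


def as_c_string(code: bytes) -> bytes:
    """Model a C string copy: keep everything before the first null byte."""
    return code.split(b"\x00", 1)[0]


print(is_null_free(MOV_EAX_0))     # False
print(is_null_free(XOR_EAX_EAX))   # True
print(as_c_string(MOV_EAX_0))      # b'\xb8'
```

The `mov` form gets cut off after its first byte when treated as a string, while the `xor` form survives intact.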
There's always more than one way to do things.
System calls are how userland programs talk to the kernel to do anything interesting.
libc functions are high level syscall wrappers.
Like programs, your shellcode needs syscalls to do anything of interest.
Syscalls can be made in x86 using interrupt 0x80: int 0x80
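To see the "wrapper around a syscall" idea from a high-level language, here is a minimal sketch using Python's `os` module, whose `pipe`, `write` and `read` functions are thin wrappers over the corresponding system calls on Unix-like systems. It is only an analogy for what libc does for C programs, not shellcode.

```python
import os

# os.pipe, os.write and os.read map closely onto the pipe(2), write(2)
# and read(2) system calls on Unix-like systems.
read_fd, write_fd = os.pipe()
written = os.write(write_fd, b"hello kernel")   # returns the byte count, like write(2)
os.close(write_fd)
data = os.read(read_fd, 64)
os.close(read_fd)

print(written, data)   # 12 b'hello kernel'
```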
Remember 'nop' (\x90) is an instruction that does nothing.
If you don't know the exact address of your shellcode in memory, pad your exploit with nop instructions to make it more reliable.
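The layout of a NOP-padded buffer can be sketched as follows. This is a toy under stated assumptions: the function name `build_sled_payload`, the 64-byte distance to the saved return address, and the guessed address are all invented for illustration, and the "shellcode" is a single `int3` placeholder rather than a real payload.

```python
import struct

NOP = b"\x90"


def build_sled_payload(shellcode: bytes, buf_len: int, guessed_addr: int) -> bytes:
    """Lay out [NOP sled][shellcode][little-endian return address].

    buf_len is the assumed distance from the start of the buffer to the
    saved return address; guessed_addr only has to land somewhere in the sled.
    """
    if len(shellcode) > buf_len:
        raise ValueError("shellcode does not fit in the buffer")
    sled = NOP * (buf_len - len(shellcode))
    return sled + shellcode + struct.pack("<I", guessed_addr)


# Placeholder "shellcode": a single int3 breakpoint, not a real payload.
payload = build_sled_payload(b"\xcc", buf_len=64, guessed_addr=0xBFFFF010)
print(len(payload))   # 68
```

The longer the sled, the larger the range of guessed addresses that still slide into the shellcode.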
Sometimes a program accepts only ASCII characters...so you need alphanumeric shellcode!
Format string based vulnerabilities are less common nowadays, but they are an important bug class that can be tricky to exploit.
A format string is a string with conversion specifiers.
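A toy model can make conversion specifiers concrete, including `%n`, which stores the count of characters produced so far instead of printing anything. The `toy_printf` function below is a simplified stand-in written for this note, not the real printf: it supports only `%d`, `%s`, `%%` and `%n`, and "addresses" are just dictionary keys.

```python
def toy_printf(fmt: str, args: list, memory: dict) -> str:
    """Tiny model of printf supporting %d, %s, %% and %n.

    For %n the matching argument is treated as an "address" (a key in
    memory) and the number of characters produced so far is stored there.
    """
    out = []
    count = 0
    arg_iter = iter(args)
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            spec = fmt[i + 1]
            i += 2
            if spec == "n":
                memory[next(arg_iter)] = count   # write, do not print
                continue
            if spec == "%":
                text = "%"
            elif spec in "ds":
                text = str(next(arg_iter))
            else:
                raise ValueError(f"unsupported specifier %{spec}")
        else:
            text = ch
            i += 1
        out.append(text)
        count += len(text)
    return "".join(out)


memory = {}
result = toy_printf("AAAA%d%n", [1234, "target"], memory)
print(result, memory)   # AAAA1234 {'target': 8}
```

The point of the model is that whoever controls the format string controls both how many characters are counted and which argument is used as the destination.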
"%n" takes a pointer as an argument and writes the number of bytes output so far to the address it points to.
Data Execution Prevention (DEP) is one of the pillars of modern exploit mitigation technologies. Understanding how DEP works and how it can be bypassed is important in exploiting real world targets.
DEP can be bypassed through Return Oriented Programming (ROP).
Data Execution Prevention -- An exploit mitigation technique used to ensure that only code segments are ever marked as executable. Meant to mitigate code injection/shellcode payloads.
No segment of memory should ever be writable and executable at the same time.
Data should never be executable, only code.
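One way to check the "never writable and executable" rule is to scan a process memory map for regions that carry both permissions. The sketch below assumes text in the Linux `/proc/<pid>/maps` layout; the function name `find_wx_regions` and the sample map are made up for illustration.

```python
def find_wx_regions(maps_text: str) -> list:
    """Return the lines of maps-style text that are both writable and executable."""
    flagged = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        perms = fields[1]          # e.g. "r-xp", "rw-p", "rwxp"
        if "w" in perms and "x" in perms:
            flagged.append(line)
    return flagged


# Made-up sample in the /proc/<pid>/maps layout.
SAMPLE = """\
08048000-08049000 r-xp 00000000 08:01 1234 /bin/example
08049000-0804a000 rw-p 00001000 08:01 1234 /bin/example
bffdf000-c0000000 rwxp 00000000 00:00 0 [stack]
"""
print(find_wx_regions(SAMPLE))
```

In this sample only the stack line is flagged, which is the situation DEP is meant to rule out.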
Technologies in modern exploit mitigations are incredibly young, and the field of computer security is rapidly evolving.
DEP is one of the main mitigation technologies you must bypass in modern exploitation.
DEP stops an attacker from easily executing injected shellcode assuming they gain control of EIP.
If you can't inject (shell) code to do your bidding, you must re-use the existing code! This technique is usually some form of ROP.
Return Oriented Programming -- A technique in exploitation to reuse existing code gadgets in a target binary as a method to bypass DEP.
Gadget--A sequence of meaningful instructions typically followed by a return instruction.
Usually multiple gadgets are chained together to compute malicious actions like shellcode does. These chains are called ROP chains.
Preventing the introduction of malicious code is not enough to prevent the execution of malicious computations.
It is almost always possible to create a logically equivalent ROP chain for a given piece of shellcode.
Typically in modern exploitation you might only get one targeted overwrite rather than a straight stack smash.
Stack Pivot--Use a gadget to move ESP into a more favorable location.
Any gadgets that touch ESP will probably be of interest for a pivot scenario.
You can always pad your ROP chains with ROP nops, which are simply gadgets that point to ret's.
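The gadget, chain and ROP nop ideas above can be modelled on a toy machine. Everything here is hypothetical: the two-register machine, the gadget addresses and the `run_chain` helper exist only for this sketch, which shows how a list of addresses and data on the stack ends up computing something without any injected code.

```python
def run_chain(gadgets: dict, stack: list) -> dict:
    """Toy model of a ROP chain on a machine with two registers.

    Every gadget ends in an implicit 'ret', which pops the next stack
    entry and jumps to it. 'pop reg' consumes one stack entry as data.
    """
    regs = {"eax": 0, "ebx": 0}
    sp = 0
    while sp < len(stack):
        addr = stack[sp]          # 'ret' loads the next gadget address
        sp += 1
        if addr not in gadgets:
            break                 # a real CPU would crash here
        for op, arg in gadgets[addr]:
            if op == "pop":
                regs[arg] = stack[sp]
                sp += 1
            elif op == "add":
                dst, src = arg
                regs[dst] = (regs[dst] + regs[src]) & 0xFFFFFFFF
    return regs


# Hypothetical gadget addresses and contents, invented for this sketch.
GADGETS = {
    0x1000: [("pop", "eax")],            # pop eax; ret
    0x2000: [("pop", "ebx")],            # pop ebx; ret
    0x3000: [("add", ("eax", "ebx"))],   # add eax, ebx; ret
    0x4000: [],                          # ret  (a "ROP nop")
}

chain = [0x4000, 0x4000, 0x1000, 40, 0x2000, 2, 0x3000]
print(run_chain(GADGETS, chain)["eax"])   # 42
```

The two leading `0x4000` entries do nothing except advance to the next entry, which is what makes them useful as padding.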
'ret2libc' is a technique of ROP where you return to functions in standard libraries (libc), rather than using gadgets.
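For a 32-bit target that passes arguments on the stack, the stack layout of a return into a library function can be sketched like this. The helper `ret2libc_frame`, the padding length and all three addresses are placeholders chosen for illustration; real values depend entirely on the target.

```python
import struct


def ret2libc_frame(padding_len: int, func_addr: int, ret_addr: int, arg_addr: int) -> bytes:
    """32-bit stack-argument layout: [padding][function][its return address][first argument]."""
    return (
        b"A" * padding_len
        + struct.pack("<I", func_addr)   # overwrites the saved return address
        + struct.pack("<I", ret_addr)    # where the function returns afterwards
        + struct.pack("<I", arg_addr)    # first stack argument seen by the function
    )


# Obviously fake addresses, used only to show the ordering.
frame = ret2libc_frame(16, 0x11111111, 0x22222222, 0x33333333)
print(len(frame))   # 28
```

The called function sees the second word as its own return address and the third as its first argument, because that is how a normal call would have left the stack.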
Game consoles are among the most secure off the shelf products consumers can buy, so it's interesting to look at the technical aspects of the exploits and bugs that cracked them open.
Fewer bugs != more secure. One bug is still enough to blow things wide open.
SELinux (Security Enhanced) is an implementation of mandatory access controls (MAC) on Linux. Mandatory access controls allow an administrator of a system to define how applications and users can access different resources such as files, devices, networks and inter-process communication.
Updates don't always patch bugs, sometimes they introduce them.
ASLR is the second big pillar in modern exploit mitigation technologies. It's designed to mitigate exploits that rely on hard coded code/stack/heap addresses by randomizing the layout of memory for every execution.
Address Space Layout Randomization (ASLR)--An exploit mitigation technology used to ensure that address ranges for important memory segments are random for every execution.
A simple stack smash may get you control of EIP, but what does it matter if you have no idea where you can go with it?
Reminder: Security is rapidly evolving.
Position Independent Executable--Executables compiled such that their base address does not matter, 'position independent code'.
An info leak is when you can extract meaningful information (such as a memory address) from the ASLR protected service or binary.
If you can leak any sort of pointer to code during your exploit, you have likely defeated ASLR.
Given a single pointer into a memory segment, you can compute the location of everything around it.
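The arithmetic behind that claim is short. In this sketch the function name `rebase`, the leaked value and both offsets are invented; the assumption is that the offsets from the module base are known in advance from the file on disk, since ASLR moves the whole segment as one unit.

```python
def rebase(leaked_addr: int, leaked_offset: int, wanted_offset: int) -> int:
    """Compute the runtime address of a symbol from one leaked pointer.

    leaked_offset and wanted_offset are offsets from the module base,
    assumed to be known ahead of time from the binary on disk.
    """
    base = leaked_addr - leaked_offset
    return base + wanted_offset


leak = 0xB7D4A190   # pretend value obtained from an info leak
print(hex(rebase(leak, 0x1A190, 0x3F430)))   # 0xb7d6f430
```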
Info leaks are the most used ASLR bypass in real world exploitation as they give assurances.
Don't bother to try brute-forcing addresses on a 64-bit machine of any kind.
Ubuntu ASLR is rather weak, low-entropy.
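Why entropy matters for brute forcing can be shown with a little arithmetic. The sketch assumes every attempt faces a freshly randomized layout with all positions equally likely; the bit counts in the example are illustrative and not measurements of any particular system.

```python
def expected_attempts(entropy_bits: int) -> int:
    """Average number of independent guesses needed with 2**bits equally likely layouts."""
    return 2 ** entropy_bits


def success_probability(entropy_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent guesses is right."""
    p = 1 / 2 ** entropy_bits
    return 1 - (1 - p) ** attempts


# Illustrative bit counts only.
for bits in (8, 16, 28):
    print(bits, expected_attempts(bits), round(success_probability(bits, 1000), 6))
```

Each extra bit of entropy doubles the expected work, which is why low-entropy layouts are brute-forceable and large 64-bit ones are not worth trying.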
Like other mitigation technologies, ASLR is a "tack on" solution that only makes things harder.
The vulnerabilities and exploits become both more complex and precise the deeper down the rabbit hole we go.
DEP & ASLR are the two main pillars of modern exploit mitigation technologies.
Many exploits found in the wild today likely touch on the heap in some form. As stack based memory corruption has grown harder to utilize, the bug hunt has continued into the heap space and brought rise to new classes of vulnerabilities and techniques.
The heap is a pool of memory used for dynamic allocations at runtime.
Everyone uses the heap (dynamic memory) but few usually know much about its internals.
The heap grows toward higher memory addresses.
The stack grows toward lower memory addresses.
Buffer overflows are basically the same on the heap as they are on the stack.
Heap cookies/canaries aren't a thing because there are no 'return' addresses to protect.
In the real world, lots of cool and complex things like objects/structs end up on the heap.
It's common to put function pointers in structs which generally are malloc'd on the heap.
Use After Free (UAF)--A class of vulnerability where data on the heap is freed, but a leftover reference or 'dangling pointer' is used by the code as if the data were still valid.
To exploit a UAF, you usually have to allocate a different type of object over the one you just freed.
UAF vulnerabilities don't require any memory corruption to use.
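The mechanics of a UAF can be modelled with a toy allocator that hands freed slots back out. `ToyHeap` and the two callbacks are invented for this sketch, and slot indices stand in for pointers; real allocators are far more involved, but the reuse of freed memory is the part that matters here.

```python
class ToyHeap:
    """Toy allocator: freed slots are reused by the next allocation."""

    def __init__(self):
        self.slots = []
        self.free_list = []

    def malloc(self, obj) -> int:
        if self.free_list:
            idx = self.free_list.pop()   # reuse the most recently freed slot
            self.slots[idx] = obj
        else:
            self.slots.append(obj)
            idx = len(self.slots) - 1
        return idx

    def free(self, idx: int) -> None:
        # The slot is only marked reusable; old "pointers" to it still work.
        self.free_list.append(idx)


def intended():
    return "intended handler"


def attacker_chosen():
    return "attacker-chosen handler"


heap = ToyHeap()
handler = heap.malloc({"callback": intended})
heap.free(handler)                                     # object freed...
other = heap.malloc({"callback": attacker_chosen})     # ...and its slot is reused
print(heap.slots[handler]["callback"]())               # dangling reference is used
```

No memory was corrupted at any point: the program simply kept using a reference after the slot behind it had been given to a different object.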
Heap Spraying--A technique used to increase exploit reliability, by filling the heap with large chunks of data relevant to the exploit you're trying to land.
A heap spray is not a vulnerability or security flaw.
Usually heap sprays are done in something like JavaScript placed on a malicious HTML page.
Metadata corruption based exploits involve corrupting heap metadata in such a way that you can use the allocator's internal functions to cause a controlled write of some sort.
Heap metadata corruption based exploits are usually very involved and require more intimate knowledge of heap internals.
Metadata exploits are hard to pull off nowadays as heaps are fairly hardened.
A signed integer can be interpreted as positive or negative.
An unsigned integer is only ever zero and up.
A signed int uses the top bit to specify if it is a positive or negative number.
Variable types are known at compile time, so signed instructions are compiled in to handle your variable.
It's very common to see modern bugs stem from integer confusion and misuse.
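A short sketch of signed versus unsigned interpretation, and of a bounds check that goes wrong because of it. The helpers `as_signed32`, `as_unsigned32` and `passes_signed_check`, along with the `MAX_LEN` limit, are made up for this example and model 32-bit integers.

```python
import struct


def as_signed32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed two's-complement int."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


def as_unsigned32(value: int) -> int:
    """Reinterpret the low 32 bits of value as an unsigned int."""
    return value & 0xFFFFFFFF


MAX_LEN = 64


def passes_signed_check(length: int) -> bool:
    """A bounds check that compares as signed and forgets about negatives."""
    return as_signed32(length) < MAX_LEN


user_len = -1
print(passes_signed_check(user_len))   # True
print(as_unsigned32(user_len))         # 4294967295
print(as_signed32(0x80000000))         # -2147483648
```

The same bit pattern passes the signed check as -1 and is then treated as roughly 4 GB when a copy routine reads it as an unsigned size.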
In Linux, everything is a file.
A stack canary is a random integer pushed onto the stack after certain sensitive values (such as the saved return address) are pushed, then popped off the stack and checked before those values are used.
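The check can be modelled with a flat byte array standing in for a stack frame. `ToyFrame` is a hypothetical helper for this note, and the layout of buffer, canary, then return address is a simplification of what a compiler would emit.

```python
import os


class ToyFrame:
    """Models [buffer][canary][return address] laid out contiguously."""

    def __init__(self, buf_len: int, ret_addr: bytes, canary: bytes = None):
        self.buf_len = buf_len
        # A fixed canary can be passed in for repeatable demos.
        self.canary = canary if canary is not None else os.urandom(4)
        self.memory = bytearray(buf_len) + bytearray(self.canary) + bytearray(ret_addr)

    def unchecked_copy(self, data: bytes) -> None:
        """Copy with no bounds check, like strcpy into a stack buffer."""
        self.memory[: len(data)] = data

    def canary_intact(self) -> bool:
        """The check performed before the return address is used."""
        stored = bytes(self.memory[self.buf_len : self.buf_len + 4])
        return stored == self.canary


frame = ToyFrame(16, b"\x10\x20\x30\x40", canary=b"\x01\x02\x03\x04")
frame.unchecked_copy(b"A" * 16)
print(frame.canary_intact())   # True: the copy stayed inside the buffer
frame.unchecked_copy(b"A" * 24)
print(frame.canary_intact())   # False: the overflow ran through the canary
```

A linear overflow cannot reach the return address without first changing the canary, which is what the check detects.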
C++ adds a number of conveniences that C lacks. Some of these additions help mitigate common exploitation avenues that we are used to such as string mishandling.
Use of C++ std::string removes a lot of potential memory corruption introduced by C-style strings.
Inheritance introduces (non-standard) complexity.
VTables enable polymorphism.
Kernel exploitation is the process of attacking the operating system itself. Vulnerabilities in the kernel can result in full takeover of a system and are among the most powerful bugs we can find.
Userspace is an abstraction that runs "on top" of the kernel.
The kernel is the core of the operating system.
"jail breaking" or "rooting" devices often depends on finding and leveraging kernel bugs.
Your kernel is:

Managing your processes
Managing your memory
Coordinating your hardware

The kernel is typically the most powerful place we can find bugs.
Basic Kernel Exploit strategy:

Find vulnerability in kernel code.
Manipulate it to gain code execution.
Elevate our process's privilege level.
Survive the "trip" back to userland.
Enjoy our root privileges.

Kernel vulnerabilities are almost exactly the same as userland vulnerabilities.
The most common place to find vulnerabilities is inside of Loadable Kernel Modules (LKMs).
LKMs are like executables that run in Kernel Space.
Remember: the kernel manages running processes.
Most useful things we want to do are much easier from userland.
For exploitation, the easiest strategy is hijacking execution, and letting the kernel return by itself.
Kernel exploitation is weird, but extremely powerful.
x86 is like the wild west in computing--it's like it was designed to be exploited.
We're well into the 64-bit era at this point with 32-bit x86 machines slowly on their way out.
64-bit addresses almost always have a NULL upper byte, meaning ROP chains and string functions don't get along.
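Packing a couple of 64-bit addresses shows the problem directly. The two addresses below are made up, and `strcpy_view` is a rough model of what a C string copy would carry over.

```python
import struct


def strcpy_view(data: bytes) -> bytes:
    """What a C string copy would carry over: bytes up to the first null."""
    return data.split(b"\x00", 1)[0]


# Two made-up 64-bit user-space addresses.
chain = struct.pack("<Q", 0x00007FFFF7A52390) + struct.pack("<Q", 0x00007FFFF7B99D57)
print(len(chain), len(strcpy_view(chain)))   # 16 6
```

Only the low six bytes of the first address survive the copy, so a multi-entry chain cannot pass through a string function in one piece.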
WinDBG is Microsoft's debugger.
Syscall numbers tend to change from version to version of Windows and would be hard or unreliable to code into an exploit.
Windows XP SP2 marked the start of the modern security era.
Structured Exception Handling is a lot like assigning signal handlers on Linux.
Exception records are placed on the stack, so they're relatively easy to corrupt.
Windows based exploitation isn't too different from Linux, but it's quickly getting harder.
Systems and applications will never be perfectly secure. Period. They just have to be hard enough to break that nobody can afford it anymore.
The entry bar for binary exploitation is rising faster and faster.
Implementation & logic flaws will probably always exist--you can't really fix stupid.
Source code analyzers can help find bugs statically, but they can also miss a lot.
Fuzzing--the act of mangling data and throwing it at a target application to see if it mishandles it in some fashion.
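A minimal mutation fuzzer fits in a few lines. Everything in this sketch is hypothetical: `toy_parser` is an invented target with a deliberately planted bug, and the mutation strategy (overwrite one random byte) is about the simplest one possible.

```python
import random


def toy_parser(data: bytes) -> int:
    """Hypothetical target: the first byte is a length field for the rest."""
    if not data:
        return 0
    length = data[0]
    body = data[1:]
    # Planted bug: trusts the length field instead of checking len(body).
    return sum(body[i] for i in range(length))


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte of the seed with a random value."""
    buf = bytearray(seed)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed: bytes, rounds: int, rng_seed: int = 0) -> list:
    """Return the mutated inputs that made the target raise an exception."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(rounds):
        sample = mutate(seed, rng)
        try:
            toy_parser(sample)
        except Exception:
            crashes.append(sample)
    return crashes


crashes = fuzz(b"\x03abc", rounds=200)
print(len(crashes) > 0)
```

Every crashing input here has a length byte larger than the body it describes, which points straight at the missing bounds check.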
Fuzzing has probably been the source of over 95% of the bugs from the past 10 years.
The fuzzing era is starting to wind down.
American Fuzzy Lop (AFL)--A 'security-oriented' fuzzer that relies on instrumentation it inserts at compile time.
Many modern bugs have to be 'forced' by requiring very specific conditions--like some sort of crazy edge cases.

20180218