
20180219

Computer Science from the Bottom Up

  • An often-quoted tenet of UNIX-like systems such as Linux or BSD is that everything is a file.
  • The concept of a file is a good abstraction of either a sink for, or source of, data. As such it is an excellent abstraction of all the devices one might attach to the computer.
  • No one person can understand everything from designing a modern user-interface to the internal workings of a modern CPU, much less build it all themselves. To programmers, abstractions are the common language that allows us to collaborate and invent.
  • Learning to navigate across abstractions gives one greater insight into how to use the abstractions in the best and most innovative ways.
  • In general, abstraction is implemented by what is generically termed an Application Programming Interface (API).
  • A common method used in the Linux kernel and other large C code bases, which lack a built-in concept of object-orientation, is function pointers. Learning to read this idiom is key to navigating most large C code bases. By understanding how to read the abstractions provided within the code an understanding of internal API designs can be built.
  • Libraries have two roles which illustrate abstraction.
    • Allow programmers to reuse commonly accessed code.
    • Act as a black box implementing functionality for the programmer.
  • The standard library of a UNIX platform is generically referred to as libc. It provides the basic interface to the system: fundamental calls such as read(), write(), and printf(). This API is described in its entirety by a specification called POSIX.
  • Libraries are a fundamental abstraction with many details. The value returned by an 'open' call is termed a file descriptor and is essentially an index into an array of open files kept by the kernel.
  • Starting at the lowest level, the operating system requires a programmer to create a device driver to be able to communicate with a hardware device. This device driver is written to an API provided by the kernel; the device driver will provide a range of functions which are called by the kernel in response to various requirements.
  • To provide the abstraction to user-space, the kernel provides a file-interface via what is generically termed a device layer. Physical devices on the host are represented by a file in a special file system such as /dev.
  • Mounting a file system has the dual purpose of setting up a mapping so the file system knows the underlying device that provides the storage and the kernel knows that files opened under that mount-point should be directed to the file system driver.
  • The shell is the gateway to interacting with the operating system.
  • The 'pipe' is an in-memory buffer that connects two processes together, file descriptors point to the pipe object, which buffers data sent to it (via a write) to be drained (via a read).
  • Writes to the pipe are stored by the kernel until a corresponding read from the other side drains the buffer. This is a very powerful concept and is one of the fundamental forms of inter-process communication or IPC in UNIX-like operating systems.
  • Binary is a base-2 number system that uses two mutually exclusive states to represent information. A binary number is made up of elements called bits where each bit can be in one of the two possible states.
  • We can essentially choose to represent anything by a number, which can be converted to binary and operated on by the computer.
  • Parity allows a simple check of the bits of a byte to ensure they were read correctly. We can implement either odd or even parity by using the extra bit as a parity bit. In odd parity, if the number of 1's in the information is odd, the parity bit is set; otherwise it is not set. Even parity is the opposite: if the number of 1's is even, the parity bit is set to 1.
  • It can be very useful to commit base-2 factors to memory as an aid to quickly correlate the relationship between number-of-bits and "human" sizes.
  • Electronically, the Boolean operations are implemented in gates made by transistors.
  • Computers only ever deal in binary and hexadecimal is simply a shortcut for us humans trying to work with the computer.
  • In low level code, it is often important to keep your structures and variables as space efficient as possible. In some cases, this can involve effectively packing two (generally related) variables into one.
  • Often a program will have a large number of variables that only exist as flags to some condition.
  • C is the common language of the systems programming world. Every operating system and its associated system libraries in common use are written in C, and every system provides a C compiler.
  • The "glue" between the C standard and the underlying architecture is the Application Binary Interface (or ABI) which we discuss below.
  • In a typed language, such as C, every variable must be declared with a type. The type tells the computer about what we expect to store in a variable; the compiler can then both allocate sufficient space for this usage and check that the programmer does not violate the rules of the type.
  • The C99 standard purposely only mentions the smallest possible size of each of the types defined for C. This is because across different processor architectures and operating systems the best size for types can be wildly different.
  • To be completely safe, programmers need to never assume the size of any of their variables.
  • Pointers are really just an address (i.e. their value is an address and thus "points" somewhere else in memory) therefore a pointer needs to be sufficient in size to be able to address any memory in the system.
  • A 64-bit variable is so large that it is not generally required to represent many variables.
  • 'signed' and 'unsigned' are probably the two most important qualifiers; and they say if a variable can take on a negative value or not.
  • Qualifiers are all intended to pass extra information about how the variable will be used to the compiler. This means two things; the compiler can check if you are violating your own rules and it can make optimizations based upon the extra knowledge.
  • By implementing two's complement hardware designers need only provide logic for addition circuits; subtraction can be done by two's complement negating the value to be subtracted and then adding the new value. Similarly you could implement multiplication with repeated addition and division with repeated subtraction. Consequently two's complement can reduce all simple mathematical operations down to addition!
  • All modern computers use two's complement representation.
  • Because of two's complement format, when increasing the size of a signed value, it is important that the additional bits be sign-extended; that is, copied from the top-bit of the existing value.
  • To create a decimal number, we require some way to represent the concept of the decimal place in binary. The most common scheme for this is known as the IEEE 754 floating point standard.
  • The CPU performs instructions on values held in registers.
  • To greatly simplify, a computer consists of a central processing unit (CPU) attached to memory.
  • The CPU executes instructions read from memory. There are two categories of instructions:
    • Those that load values from memory into registers and store values from registers to memory.
    • Those that operate on values stored in registers. For example adding, subtracting, multiplying or dividing the values in two registers, performing bitwise operations, or performing other mathematical operations.
  • Internally, the CPU keeps a record of the next instruction to be executed in the instruction pointer. Usually, the instruction pointer is incremented to point to the next instruction sequentially; a branch instruction will usually check if a specific register is zero or if a flag is set and, if so, will modify the pointer to a different address. Thus the next instruction to execute will be from a different part of the program; this is how loops and decision statements work.
  • Executing a single instruction consists of a particular cycle of events.
    • Fetch: get the instruction from memory into the processor.
    • Decode: internally decode what it has to do.
    • Execute: take the values from the registers, actually add them together.
    • Store: store the result back into another register.
  • The CPU has two main types of registers, those for integer calculations and those for floating point calculations.
  • A register file is the collective name for the registers inside the CPU.
  • The Arithmetic Logic Unit (ALU) is the heart of the CPU operation. It takes values in registers and performs any of the multitude of operations the CPU is capable of. All modern processors have a number of ALUs so each can be working independently.
  • The Address Generation Unit (AGU) handles talking to cache and main memory to get values into the registers for the ALU to operate on, and to get values out of registers back into main memory.
  • If you require 'acquire' semantics this means that for this instruction you must ensure that the results of all previous instructions have been completed. If you require release semantics you are saying that all instructions after this one must see the current result.
  • As we know from the memory hierarchy, registers are the fastest type of memory and ultimately all instructions must be performed on values held in registers, so all other things being equal more registers leads to higher performance.
  • The CPU can only directly fetch instructions and data from cache memory, located directly on the processor chip. Cache memory must be loaded in from the main system memory (RAM). RAM however, only retains its contents when the power is on, so needs to be stored on more permanent storage.
  • Cache memory is memory actually embedded inside the CPU.
  • The important point to know about the memory hierarchy is the trade offs between speed and size--the faster the memory the smaller it is.
  • The reason caches are effective is because computer code generally exhibits two forms of locality:
    • Spatial locality suggests that data within blocks is likely to be accessed together.
    • Temporal locality suggests that data that was used recently will likely be used again shortly.
  • Cache is one of the most important elements of the CPU architecture.
  • When data is only read from the cache there is no need to ensure consistency with main memory. However, when the processor starts writing to cache lines it needs to make some decisions about how to update the underlying main memory.
    • A write-through cache will write the changes directly into the main system memory as the processor updates the cache.
    • A write-back cache delays writing the changes to RAM until absolutely necessary.
  • To quickly decide if an address lies within the cache, it is separated into three parts: the tag, the index, and the offset.
  • Peripherals are any of the many external devices that connect to your computer.
  • The communication channel between the processor and the peripheral is called a bus.
  • A device requires both input and output to be useful.
  • An interrupt allows the device to literally interrupt the processor to flag some information. Each device is assigned an interrupt by some combination of the operating system and BIOS.
  • Devices are generally connected to a programmable interrupt controller (PIC), a separate chip that is part of the motherboard which buffers and communicates interrupt information to the main processor. Each device has a physical interrupt line between it and one of the PICs provided by the system. When the device wants to interrupt, it will modify the voltage on this line.
  • A very broad description of the PIC's role is that it receives this interrupt and converts it to a message for consumption by the main processor. While the exact procedure varies by architecture, the general principle is that the operating system has configured an interrupt descriptor table which pairs each of the possible interrupts with a code address to jump to when the interrupt is received.
  • Writing the interrupt handler is the job of the device driver author in conjunction with the operating system.
  • A generic overview of handling an interrupt. The device raises the interrupt to the interrupt controller, which passes the information onto the processor. The processor looks at its descriptor table, filled out by the operating system, to find the code to handle the fault.
  • Most drivers will split the handling of interrupts into top and bottom halves. The top half acknowledges the interrupt, queues actions for processing, and quickly returns the processor to what it was doing. The bottom half then runs later, when the CPU is free, and does the more intensive processing. This stops an interrupt from hogging the entire CPU.
  • While an interrupt is generally associated with an external event from a physical device, the same mechanism is useful for handling internal system operations.
  • There are two main ways of signalling interrupts on a line--level and edge triggered.
  • It is important for the system to be able to mask or prevent interrupts at certain times. Generally, it is possible to put interrupts on hold, but a particular class of interrupts, called non-maskable interrupts (NMI), are the exception to this rule.
  • The most common form of IO is called memory mapped IO where registers on the device are mapped into memory. This means that to communicate with the device, you need simply read or write to a specific address in memory.
  • Direct Memory Access (DMA) is a method of transferring data directly between peripheral and system RAM.
  •  Snooping is where a processor listens on a bus, which all processors are connected to, for cache events, and updates its cache accordingly.
  • Having [multiple] processors all on the same bus starts to present physical problems. Physical properties of wires only allow them to be laid out at certain distances from each other and to only have certain lengths. With processors that run at many gigahertz, the speed of light starts to become a real consideration in how long it takes messages to move around a system.
  • Much of the time of a modern processor is spent waiting for much slower devices in the memory hierarchy to deliver data for processing. Thus strategies to keep the pipeline of the processor full are paramount.
  • A cluster is simply a number of individual computers which have some ability to talk to each other. At the hardware level the systems have no knowledge of each other; the task of stitching the individual computers together is left up to software.
  • Programmers need to use techniques such as profiling to analyze the code paths taken and what consequences their code is causing for the system to extract best performance.
  • Programmers use a higher level of abstraction called locking to allow simultaneous operation of programs when there are multiple CPUs. When a program acquires a lock over a piece of code, no other processor can obtain the lock until it is released. Before any critical piece of code, the processor must attempt to take the lock; if it cannot acquire it, it does not continue.
  •  Locking schemes make programming more complicated, as it is possible to deadlock programs.
  • A simple lock that simply has two states--locked or unlocked--is referred to as a mutex (short for mutual exclusion; that is if one person has it the other can not have it).
  • The fundamental operation of the operating system (OS) is to abstract the hardware to the programmer and user. The operating system provides generic interfaces to services provided by the underlying hardware.
  • Processes the kernel is running live in userspace, and the kernel talks both directly to hardware and through drivers.
  • The kernel is the operating system.
  • Just as the kernel abstracts the hardware to user programs, drivers abstract hardware to the kernel.
  • The Linux kernel implements a module system, where drivers can be loaded into the running kernel "on the fly" as they are required.
  • System calls are how userspace programs interact with the kernel.
  • Each and every system call has a system call number which is known by both the userspace and the kernel.
  • The Application Binary Interface (ABI) is very similar to an API, but it is an interface agreed at the machine level (registers, calling conventions, binary layout) rather than at the source level.
  • Ensuring the application only accesses memory it owns is implemented by the virtual memory system. The essential point is that the hardware is responsible for enforcing these rules.
  • The process ID (or the PID) is assigned by the operating system and is unique to each running process.
  • Program code and data should be kept separately, since they require different permissions from the operating system and separation facilitates sharing of code. The operating system needs to give program code permission to be read and executed, but generally not written to. On the other hand, data (variables) requires read and write permission but should not be executable.
  • Stacks are fundamental to function calls. Each time a function is called it gets a new "stack frame". This is an area of memory which usually contains, at a minimum, the address to return to when complete, the input arguments to the function and space for local variables.
  • Stacks are ultimately managed by the compiler, as it is responsible for generating the program code. To the operating system the stack just looks like any other area of memory for the process.
  • To keep track of the current growth of the stack, the hardware defines a register as the stack pointer.
  • The heap is an area of memory that is managed by the process for on-the-fly memory allocation. This is for variables whose memory requirements are not known at compile time.
  • The boundary of the heap is known as the brk, so called for the system call which modifies it. By using the brk call to grow the area (toward higher addresses in the conventional layout), the process can request that the kernel allocate more memory for it to use.
  • The heap is most commonly managed by the malloc library call. This makes managing the heap easy for the programmer by allowing them to simply allocate and free heap memory.
  • Due to the complexity of managing memory correctly, it is very uncommon for any modern program to have a reason to call brk directly.
  • Whilst the operating system can run many processes at the same time, in fact it only ever directly starts one process, called the init (short for initial) process. This isn't a particularly special process except that its PID is always 1 and it will always be running. All other processes can be considered children of this initial process.
  • The return value from the system call is the only way the process can determine if it was the existing process or a new one. The return value to the parent process will be the Process ID (PID) of the child, whilst the child will get a return value of 0.
  • Separate processes cannot see each other's memory. They can only communicate with each other via system calls. Threads, however, share the same memory. So you have the advantage of multiple processes without the expense of having to use system calls to communicate between them.
  • A running system has many processes, maybe even into the hundreds or thousands. The part of the kernel that keeps track of all these processes is called the scheduler because it schedules which process should be run next.
  • Scheduling strategies can broadly fall into two categories.
    • Co-operative scheduling is where the currently running process voluntarily gives up executing to allow another process to run.
    • Preemptive scheduling is where the currently running process is interrupted to allow another process to run.
  • Hard-realtime systems make guarantees about scheduling decisions, like the maximum amount of time a process will be interrupted before it can run again. They are often used in life-critical applications like medical, aircraft, and military systems.
  • Big-O notation is a way of describing how long an algorithm takes to run given increasing inputs.
  • On a UNIX system, the shell is the standard interface to handling a process on your system.
  • The primary job of the shell is to help the user handle starting, stopping, and otherwise controlling processes running in the system.
  • Processes running in the system require a way to be told about events that influence them. On UNIX there is infrastructure between the kernel and processes called signals which allows a process to receive notification about events important to it.
  • When a signal is sent to a process, the kernel invokes a handler which the process must register with the kernel to deal with that signal. A handler is simply a designated function in the code that has been written to specifically deal with that signal.
  • Virtual memory is all about making use of address space.
  • The address space of a processor refers to the range of possible addresses that it can use when loading and storing to memory. The address space is limited by the width of the registers, since as we know to load an address we need to issue a load instruction with the address to load from stored in a register.
  • Every program compiled in 64-bit mode requires 8-byte pointers, which can increase code and data size, and hence impact both instruction and data cache performance.
  • While 64-bit processors have 64-bit wide registers, systems generally do not implement all 64 bits for addressing--it is not actually possible to load or store to all 16 exabytes of the theoretical address space.
  • As with most components of the operating system, virtual memory acts as an abstraction between the address space and the physical memory available in the system. This means that when a program uses an address that address does not refer to the bits in an actual physical location in memory.
  • All addresses a program uses are virtual. The operating system keeps track of virtual addresses and how they are allocated to physical addresses. When a program does a load or store from an address, the processors and operating system work together to convert this virtual address to the actual address in the system memory chips.
  • The total address-space is divided into individual pages. Pages can be many different sizes; generally they are around 4 KiB. The page is the smallest unit of memory that the operating system and hardware can deal with.
  • Just as the operating system divides the possible address space up into pages, it divides the available physical memory up into frames. A frame is just the conventional name for a hunk of physical memory the same size as the system page size.
  • The operating system keeps a frame-table, which is a list of all possible pages of physical memory and whether they are free (available for allocation) or not. When memory is allocated to a process, it is marked as used in the frame-table. In this way, the operating system keeps track of all memory allocations.
  • It is the job of the operating system to keep track of which virtual-page points to which physical frame. This information is kept in a page-table which, in its simplest form, could simply be a table where each row contains its associated frame--this is termed a linear page-table.
  • Virtual address translation refers to the process of finding out which physical frame a given virtual page maps to.
  • By giving each process its own page table, every process can pretend that it has access to the entire address space available from the processor.
  • In a system without virtual memory, every process has complete access to all of system memory. This means that there is nothing stopping one process from overwriting another process's memory, causing it to crash (or perhaps worse).
  • Systems that use virtual memory are inherently more stable, because, assuming the perfect operating system, a process can only crash itself and not the entire system.
  • Virtual memory is necessarily quite dependent on the hardware architecture, and each architecture has its own subtleties.
  • All processors have some concept of either operating in physical or virtual mode. In physical mode, the hardware expects that any address will refer to an address in actual system memory. In virtual mode, the hardware knows that addresses will need to be translated to find their physical address.
  • Segmentation is really only interesting as a historical note, since virtual memory has made it less relevant.
  • In segmentation there are a number of registers which hold an address that is the start of a segment. The only way to get to an address in memory is to specify it as an offset from one of these segment registers.
  • The Translation Lookaside Buffer (TLB) is the main component of the processor responsible for virtual-memory. It is a cache of virtual-page to physical-frame translations inside the processor. The operating system and hardware work together to manage the TLB as the system runs.
  • A program that can be loaded directly into memory needs to be in a straight binary format. The process of converting source code, written in a language such as C, to a binary file ready to be executed is called compiling.
  • A compiled program is completely dependent on the hardware of the machine it is compiled for, since it must be able to simply be copied to memory and executed. A virtual machine is an abstraction of hardware into software.
  • The linking process is really two steps: combining all object files into one executable file and then going through each object file to resolve any symbols. This usually requires two passes: one to read all the symbol definitions and take note of unresolved symbols, and a second to fix up all those unresolved symbols to point to the right place.
  • At a minimum, any executable file format will need to specify where the code and data are in the binary file. These are the two primary sections within an executable file.
  • The common thread between all executable file formats is that they include a predefined, standardized header which describes how program code and data are stored in the rest of the file.
  • a.out is a very simple header format that only allows a single data, code, and BSS section. This is insufficient for modern systems with dynamic libraries.
  • The ELF specification provides for symbol tables which are simply mappings of strings (symbols) to locations in the file. Symbols are required for linking.
  • A relocation is simply a blank space left to be patched up later.
  • Sections are a way to organize the binary into logical areas to communicate information between the compiler and the linker.
  • The .bss section is defined for global variables whose value should be zero when the program starts.
  • A library is simply a collection of functions which you can call from your program.
  • A shared library is a library that is loaded dynamically at runtime for each application that requires it.
  • Dynamic linking is one of the more intricate parts of a modern operating system.
  • A core dump is simply a complete snapshot of the program as it was running at a particular time.
  • ABIs refer to lower-level interfaces which the compiler, operating system, and to some extent the processor must agree on to communicate together.
  • The kernel needs to communicate some things to programs when they start up; namely the arguments to the program, the current environment variables, and a special structure called the Auxiliary Vector or auxv. The kernel communicates this by putting all the required information on the stack for the newly created program to pick up. Thus when the program starts it can use its stack pointer to find all the startup information required.
  • The auxiliary vector is a special structure that is for passing information directly from the kernel to the newly running program.
  • Libraries are very much like a program that never gets started. They have code and data sections (functions and variables) just like every executable, but nowhere to start running. They just provide a library of functions for developers to call.
  • The dynamic linker is the program that manages shared dynamic libraries on behalf of an executable. It works to load libraries into memory and modify the program at runtime to call the functions in the library.
  • The essential part of the dynamic linker is fixing up addresses at runtime, which is the only time you can know for certain where you are loaded in memory. A relocation can simply be thought of as a note that a particular address will need to be fixed at load time. Before the code is ready to run you will need to go through and read all the relocations and fix the addresses they refer to so they point to the right place.
  • All libraries must be produced with code that can execute no matter where it is put into memory, known as position independent code (PIC).
  • The important points to remember [about dynamic linking] are:
    • Library calls in your program actually call a stub of code in the PLT of the binary.
    • That stub code loads an address and jumps to it.
    • Initially, that address points to a function in the dynamic linker which is capable of looking up the "real" function, given the information in the relocation entry for that function.
    • The dynamic linker re-writes the address that the stub code reads, so that the next time the function is called it will go straight to the right address.
  • With only static libraries there is much less potential for problems, as all library code is built directly into the binary of the application.
  • The binding of a symbol dictates its external visibility during the dynamic linking process. A local symbol is not visible outside the object file it is define in. A global symbol is visible to other object files, and can satisfy undefined references in other objects.
  • A weak reference is a special type of lower priority global reference. This means it is designed to be overridden.

20180218

Modern Binary Exploitation

  • Binary: An executable such as an EXE, ELF, Mach-O or other code container that runs on a machine.
  • Malware: A malicious binary meant to persist on a machine such as a rootkit or remote access tool (RAT).
  • Vulnerability: A bug in a binary that can be leveraged by an exploit.
  • Exploit: Specially crafted data that utilizes vulnerabilities to force the binary into doing something unintended.
  • 0-Day: A previously unknown or unpatched vulnerability that can be used by an exploit.
  • If your program simply segfaulted, consider yourself lucky.
  • The right bugs (vulnerabilities) found in binaries can be used by exploits to hijack code execution.
  • Compared to modern languages, C is considered a 'low level' language.
  • It's easy to make grievous errors in C.
  • All assembly languages are made up of instruction sets.
  • Instructions are generally simple arithmetic operations that take registers or constant values as arguments.
  • ESP - stack pointer, "top" of the current stack frame.
  • EBP - base pointer, "bottom" of the current stack frame.
  • EIP - instruction pointer, pointer to the next instruction to be executed by the CPU.
  • Cyber security is one of the fastest growing fields in computer science, though its study is rarely covered in academia due to its rapid pace of development and its technical specificity.
  • Memory corruption is typically at the heart of binary exploitation.
  • The vast majority of system-level exploits (real-world and competition) involve memory corruption.
  • Shellcode: A set of instructions that are injected by the user and executed by the exploited binary. Generally the "payload" of an exploit.
  • Using shellcode you can essentially make a program execute code that never existed in the original binary. You're basically injecting code.
  • Shellcode is generally hand coded in assembly, but its functionality can be represented in C.
  • When shellcode is read as a string, null bytes become an issue with common string functions. So, make your shellcode NULL free!
  • XOR-ing a register with itself zeroes it.
  • There's always more than one way to do things.
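The NUL-free point can be made concrete with instruction encodings. This is an illustrative sketch (byte sequences are standard x86 encodings): both snippets zero EAX, but `mov eax, 0` embeds a 32-bit immediate full of NUL bytes, while `xor eax, eax` does not, so only the latter survives strcpy-style delivery.

```c
#include <stddef.h>

/* x86 encodings of two ways to zero EAX. */
static const unsigned char mov_eax_0[]   = { 0xb8, 0x00, 0x00, 0x00, 0x00 }; /* mov eax, 0 */
static const unsigned char xor_eax_eax[] = { 0x31, 0xc0 };                   /* xor eax, eax */

/* Returns 1 if the byte sequence contains a NUL, i.e. would be
 * truncated by common string functions. */
int contains_nul(const unsigned char *code, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (code[i] == 0x00)
            return 1;
    return 0;
}
```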
  • System calls are how userland programs talk to the kernel to do anything interesting.
  • libc functions are high level syscall wrappers.
  • Like programs, your shellcode needs syscalls to do anything of interest.
  • Syscalls can be made in x86 using interrupt 0x80: int 0x80
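A Linux-specific sketch of the wrapper relationship: libc calls like `write()` are thin wrappers over kernel syscalls, and `syscall(2)` invokes one directly by number, which is what shellcode does with `int 0x80` (32-bit) or `syscall` (64-bit).

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

/* Invoke the write syscall directly by number, bypassing the libc
 * write() wrapper. Returns the number of bytes written. */
long raw_write(int fd, const void *buf, size_t len) {
    return syscall(SYS_write, fd, buf, len);
}
```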
  • Remember 'nop'  (\x90) is an instruction that does nothing.
  • If you don't know the exact address of your shellcode in memory, pad your exploit with nop instructions to make it more reliable.
  • Sometimes a program accepts only ASCII characters...so you need alphanumeric shellcode!
  • Format string based vulnerabilities are less common nowadays, but they are an important bug class that can be tricky to exploit.
  • A format string is a string with conversion specifiers.
  • "%n" takes a pointer as an argument, and writes the number of bytes written so far.
  • Data Execution Prevention (DEP) is one of the pillars of modern exploit mitigation technologies. Understanding how DEP works and how it can be bypassed is important in exploiting real world targets.
  • DEP can be bypassed through Return Oriented Programming (ROP).
  • Data Execution Prevention -- An exploit mitigation technique used to ensure that only code segments are ever marked as executable. Meant to mitigate code injection/shellcode payloads.
  • No segment of memory should ever be writable and executable at the same time.
  • Data should never be executable, only code.
  • Technologies in modern exploit mitigations are incredibly young, and the field of computer security is rapidly evolving.
  • DEP is one of the main mitigation technologies you must bypass in modern exploitation.
  • DEP stops an attacker from easily executing injected shellcode assuming they gain control of EIP.
  • If you can't inject (shell) code to do your bidding, you must re-use the existing code! This technique is usually some form of ROP.
  • Return Oriented Programming -- A technique in exploitation to reuse existing code gadgets in a target binary as a method to bypass DEP.
  • Gadget--A sequence of meaningful instructions typically followed by a return instruction.
  • Usually multiple gadgets are chained together to compute malicious actions like shellcode does. These chains are called ROP chains.
  • Preventing the introduction of malicious code is not enough to prevent the execution of malicious computations.
  • It is almost always possible to create a logically equivalent ROP chain for a given piece of shellcode.
  • Typically in modern exploitation you might only get one targeted overwrite rather than a straight stack smash.
  • Stack Pivot--Use a gadget to move ESP into a more favorable location.
  • Any gadgets that touch ESP will probably be of interest for a pivot scenario.
  • You can always pad your ROP chains with ROP nops, which are simply gadgets that point to ret's.
  • 'ret2libc' is a technique of ROP where you return to functions in standard libraries (libc), rather than using gadgets.
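A toy simulation of a ROP chain (an illustration, not real exploitation): the "stack" is an array of gadget addresses, and the dispatcher "returns" into each gadget in turn. Real gadgets are short instruction sequences ending in `ret`.

```c
/* Each "gadget" is a tiny operation on simulated registers. */
typedef void (*gadget_t)(long *regs);

static void g_load42(long *r) { r[0] = 42; }   /* like: pop eax; ret  */
static void g_add8(long *r)   { r[0] += 8; }   /* like: add eax, 8; ret */

/* The ROP chain: a sequence of gadget addresses, NULL-terminated. */
static gadget_t chain[] = { g_load42, g_add8, 0 };

long run_chain(void) {
    long regs[1] = { 0 };
    for (gadget_t *sp = chain; *sp; sp++)  /* sp plays the role of ESP */
        (*sp)(regs);
    return regs[0];
}
```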
  • Game consoles are among the most secure off the shelf products consumers can buy, so it's interesting to look at the technical aspects of the exploits and bugs that cracked them open.
  • Fewer bugs != more secure. One bug is still enough to blow things wide open.
  • SELinux (Security Enhanced) is an implementation of mandatory access controls (MAC) on Linux. Mandatory access controls allow an administrator of a system to define how applications and users can access different resources such as files, devices, networks and inter-process communication.
  • Updates don't always patch bugs, sometimes they introduce them.
  • ASLR is the second big pillar in modern exploit mitigation technologies. It's designed to mitigate exploits that rely on hard coded code/stack/heap addressed by randomizing the layout of memory for every execution.
  • Address Space Layout Randomization (ASLR)--An exploit mitigation technology used to ensure that address ranges for important memory segments are random for every execution.
  • A simple stack smash may get you control of EIP, but what does it matter if you have no idea where to go with it?
  • Reminder: Security is rapidly evolving.
  • Position Independent Executable (PIE)--An executable compiled such that its base address does not matter: 'position independent code'.
  • An info leak is when you can extract meaningful information (such as a memory address) from the ASLR-protected service or binary.
  • If you can leak any sort of pointer to code during your exploit, you have likely defeated ASLR.
  • Given a single pointer into a memory segment, you can compute the location of everything around it.
  • Info leaks are the most used ASLR bypass in real world exploitation as they give assurances.
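The info-leak arithmetic is simple: with ASLR, a whole module shifts by one random slide, so one leaked pointer plus that symbol's known static offset recovers the randomized base, and with it everything else. The addresses below are made up for illustration.

```c
/* Recover the module's randomized base from a leaked pointer to a
 * symbol whose static offset within the module is known. */
unsigned long module_base(unsigned long leaked_ptr, unsigned long sym_off) {
    return leaked_ptr - sym_off;
}

/* Compute any other address in the module from the recovered base. */
unsigned long rebase(unsigned long base, unsigned long other_off) {
    return base + other_off;
}
```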
  • Don't bother to try brute-forcing addresses on a 64-bit machine of any kind.
  • Ubuntu ASLR is rather weak, low-entropy.
  • Like other mitigation technologies, ASLR is a "tack on" solution that only makes things harder.
  • The vulnerabilities and exploits become both more complex and precise the deeper down the rabbit hole we go.
  • DEP & ASLR are the two main pillars of modern exploit mitigation technologies.
  • Many exploits found in the wild today likely touch on the heap in some form. As stack based memory corruption has grown harder to utilize, the bug hunt has continued into the heap space and brought rise to new classes of vulnerabilities and techniques.
  • The heap is a pool of memory used for dynamic allocations at runtime.
  • Everyone uses the heap (dynamic memory) but few usually know much about its internals.
  • The heap grows toward higher memory addresses.
  • The stack grows toward lower memory addresses.
  • Buffer overflows are basically the same on the heap as they are on the stack.
  • Heap cookies/canaries aren't a thing because there are no 'return' addresses to protect.
  • In the real world, lots of cool and complex things like objects/structs end up on the heap.
  • It's common to put function pointers in structs which generally are malloc'd on the heap.
  • Use After Free (UAF)--A class of vulnerability where data on the heap is freed, but a leftover reference or 'dangling pointer' is used by the code as if the data were still valid.
  • To exploit a UAF, you usually have to allocate a different type of object over the one you just freed.
  • UAF vulnerabilities don't require any memory corruption to use.
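A deterministic one-slot allocator makes the UAF mechanics visible (real heaps behave similarly through free-chunk reuse, e.g. glibc's tcache): freeing recycles the slot, so a stale pointer aliases the next allocation.

```c
#include <stddef.h>
#include <string.h>

/* Toy allocator with a single fixed slot: alloc hands out the slot,
 * free returns it to the pool, and the next alloc reuses it. */
static char slot[16];
static int slot_used = 0;

void *toy_alloc(void) {
    if (slot_used)
        return NULL;
    slot_used = 1;
    return slot;
}

void toy_free(void *p) {
    (void)p;
    slot_used = 0;
}
```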
  • Heap Spraying--A technique used to increase exploit reliability, by filling the heap with large chunks of data relevant to the exploit you're trying to land.
  • A heap spray is not a vulnerability or security flaw.
  • Usually heap sprays are done in something like JavaScript placed on a malicious HTML page.
  • Metadata corruption based exploits involve corrupting heap metadata in such a way that you can use the allocator's internal functions to cause a controlled write of some sort.
  • Heap metadata corruption based exploits are usually very involved and require more intimate knowledge of heap internals.
  • Metadata exploits are hard to pull off nowadays as heaps are fairly hardened.
  • A signed integer can be interpreted as positive or negative.
  • An unsigned integer is only ever zero and up.
  • A signed int uses the top bit to specify if it is a positive or negative number.
  • Variable types are known at compile time, so signed instructions are compiled in to handle your variable.
  • It's very common to see modern bugs stem from integer confusion and misuse.
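A sketch of the classic confusion: the same length check behaves very differently once a negative `int` is converted to `size_t`, as happens implicitly when it reaches `memcpy` or `malloc`.

```c
#include <stddef.h>

/* Signed comparison: a negative length "fits" under the limit. */
int check_as_int(int len)  { return len <= 16; }

/* After conversion to size_t, -1 becomes SIZE_MAX and is rejected. */
int check_as_size(int len) { return (size_t)len <= 16; }
```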
  • In Linux, everything is a file.
  • A stack canary is a random value placed on the stack between local buffers and the saved return address; it is checked just before the function returns, and a mismatch means the stack was smashed.
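A toy model of the canary check (real canaries are compiler-inserted and random; the fixed value and layout here are illustrative): a known value sits just past the buffer, an overly long copy clobbers it, and the check notices before "returning".

```c
#include <stddef.h>
#include <string.h>

#define CANARY 0xdeadbeefUL   /* would be random per-process in reality */

/* Simulated stack frame: 8-byte buffer followed by the canary.
 * Returns 1 if an unchecked copy overran the buffer. */
int smash_detected(const char *input, size_t len) {
    unsigned char frame[16];                 /* [0..7] buf, [8..15] canary */
    unsigned long canary = CANARY;
    memcpy(frame + 8, &canary, sizeof canary);
    memcpy(frame, input, len);               /* unchecked copy into 8-byte buf */
    memcpy(&canary, frame + 8, sizeof canary);
    return canary != CANARY;
}
```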
  • C++ adds a number of conveniences that C lacks. Some of these additions help mitigate common exploitation avenues that we are used to such as string mishandling.
  • Use of C++ std::string removes a lot of potential memory corruption introduced by C-style strings.
  • Inheritance introduces (non-standard) complexity.
  • VTables enable polymorphism.
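A C sketch of what a C++ vtable boils down to: each object carries a hidden pointer to a per-class table of function pointers, and a virtual call is an indirect call through that table. The names here are made up for illustration.

```c
/* Per-class table of function pointers: the "vtable". */
struct vtable {
    int (*speak)(void);
};

static int dog_speak(void) { return 1; }
static int cat_speak(void) { return 2; }

static const struct vtable dog_vt = { dog_speak };
static const struct vtable cat_vt = { cat_speak };

/* Every object carries a pointer to its class's vtable. */
struct animal {
    const struct vtable *vt;
};

/* A "virtual call": dispatch through the object's vtable. */
int call_speak(const struct animal *a) {
    return a->vt->speak();
}
```

This layout is also why corrupting a heap-allocated object's vtable pointer is such a common exploitation target.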
  • Kernel exploitation is the process of attacking the operating system itself. Vulnerabilities in the kernel can result in full takeover of a system and are among the most powerful bugs we can find.
  • Userspace is an abstraction that runs "on top" of the kernel.
  • The kernel is the core of the operating system.
  • "jail breaking" or "rooting" devices often depends on finding and leveraging kernel bugs.
  • Your kernel is:
    • Managing your processes
    • Managing your memory
    • Coordinating your hardware
  • The kernel is typically the most powerful place we can find bugs.
  • Basic Kernel Exploit strategy:
    • Find vulnerability in kernel code.
    • Manipulate it to gain code execution.
    • Elevate our process's privilege level.
    • Survive the "trip" back to userland.
    • Enjoy our root privileges.
  • Kernel vulnerabilities are almost exactly the same as userland vulnerabilities.
  • The most common place to find vulnerabilities is inside of Loadable Kernel Modules (LKMs).
  • LKMs are like executables that run in Kernel Space.
  • Remember: the kernel manages running processes.
  • Most useful things we want to do are much easier from userland.
  • For exploitation, the easiest strategy is hijacking execution and letting the kernel return to userland by itself.
  • Kernel exploitation is weird, but extremely powerful.
  • x86 is like the wild west in computing--it's like it was designed to be exploited.
  • We're well into the 64-bit era at this point with 32-bit x86 machines slowly on their way out.
  • 64-bit addresses almost always have a NULL upper byte, meaning ROP chains and string functions don't get along.
  • WinDBG is Microsoft's debugger.
  • Syscall numbers tend to change from version to version of Windows and would be hard or unreliable to code into an exploit.
  • Windows XP SP2 marked the start of the modern security era.
  • Structured Exception Handling is a lot like assigning signal handlers on Linux.
  • Exception records are placed on the stack, so they're relatively easy to corrupt.
  • Windows based exploitation isn't too different from Linux, but it's quickly getting harder.
  • Systems and applications will never be perfectly secure. Period. They just have to be hard enough to break that nobody can afford it anymore.
  • The entry bar for binary exploitation is rising faster and faster.
  • Implementation & logic flaws will probably always exist--you can't really fix stupid.
  • Source code analyzers can help find bugs statically, but they can also miss a lot.
  • Fuzzing--the act of mangling data and throwing it at a target application to see if it mishandles it in some fashion.
  • Fuzzing has probably been the source of over 95% of the bugs from the past 10 years.
  • The fuzzing era is starting to wind down.
  • American Fuzzy Lop (AFL)--A 'security-oriented' fuzzer that utilizes instrumentation inserted at compile time.
  • Many modern bugs have to be 'forced' by requiring very specific conditions--like some sort of crazy edge cases.
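The mangling step itself is trivial; here is a minimal mutation of the kind a fuzzer applies millions of times. Coverage-guided fuzzers such as AFL layer instrumentation feedback on top of mutations like this to decide which mutants to keep.

```c
#include <stddef.h>

/* Flip one bit of a seed input in place -- a basic fuzzing mutation. */
void flip_bit(unsigned char *buf, size_t pos, unsigned bit) {
    buf[pos] ^= (unsigned char)(1u << bit);
}
```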

20180217

The Art of Insight in Science and Engineering: Mastering Complexity by Sanjoy Mahajan


  • Science and engineering, our modern ways of understanding and altering the world, are said to be about accuracy and precision. Yet we best master the complexity of our world by cultivating insight rather than precision.
  • We need insight because our minds are but a small part of the world.
  • An insight unifies fragments of knowledge into a compact picture that fits in our minds. But precision can overflow our mental registers, washing away the understanding brought by insight.
  • There are two broad ways to master complexity: organize the complexity or discard it.
  • To master complexity use:
    • divide and conquer
    • abstraction
    • symmetry and conservation
    • proportional reasoning
    • dimensional analysis
    • lumping
    • probabilistic reasoning
    • easy cases
    • spring models
  • The most effective teacher is a skilled tutor. A tutor asks many questions, because questioning, wondering, and discussing promote learning.
  • We cannot find much insight staring at a mess. We need to organize it.
  • In problem-solving, we organize complexity by using divide-and-conquer reasoning and by making abstractions.
  • Reliability comes from intelligent redundancy.
  • Divide-and-conquer estimates require reasonable estimates for the leaf quantities.
  • The main lesson that you should take away is courage: No problem is too difficult. We just use divide-and-conquer reasoning to dissolve difficult problems into smaller pieces.
  • Naming--or, more technically, abstraction--is our other tool for organizing complexity.
  • A name or an abstraction gets its power from its reusability. Without reusable ideas, the world would become unmanageably complicated.
  • Notations are abstractions, and good abstractions amplify our intelligence.
  • Our understanding of the world is built on layers of abstractions.
  • The benefit of the abstraction solution, compared to calculating [...] explicitly, is insight.
  • Abstraction has a second benefit: giving us a high-level view of a problem or situation. Abstractions then show us structural similarities between seemingly disparate situations.
  • The key is to practice effectively.
  • Because abstractions are so useful, it is helpful to have methods for making them. One way is to construct an analogy between two systems. Each common feature leads to an abstraction; each abstraction connects our knowledge in one system to our knowledge in the other system.
  • Analogies not only reuse work, they help us rewrite expressions in compact, insightful forms.
  • A good notation should help thinking, not hinder it by requiring us to remember how the notation works.
  • Once you name an idea, you find it everywhere.
  • An abstraction connects seemingly random knowledge and insights. By building abstractions, we amplify our intelligence.
  • Whenever you reuse an idea, identify the transferable process and name it: make an abstraction. With a name you will recognize and reuse it.
  • We use symmetry and conservation whenever we find a quantity that, despite the surrounding complexity, does not change. This conserved quantity is called an invariant. Finding invariants simplifies many problems.
  • When there is change, look for what does not change!
  • Invariants are powerful partly because they are abstractions.
  • Logarithmic scales can make otherwise obscure symbolic calculations intuitive.
  • Drag, one of the most difficult subjects in physics, is also one of the most important forces in everyday life.
  • High accuracy often requires analyzing and tracking many physical effects. The calculations and bookkeeping can easily obscure the most important effect and its core idea, costing us insight and understanding.
  • In the midst of change, find what does not change--the invariant or conserved quantity. Finding these quantities simplifies problems: We focus on the few quantities that do not change rather than on the many ways in which quantities do change. An instance of this idea with wide application is a box model, where what goes in must come out.
  • Proportionalities are often called scaling relations.
  • Scaling exponents are a powerful abstraction: once you know the scaling exponent, you usually do not care about the mechanism underlying it.
  • Scaling relations bootstrap our knowledge.
  • Divide and conquer: don't bite off all the complexity at once!
  • Proportional reasoning focuses our attention on how one quantity determines another. By guiding us toward what is often the most important characteristic of a problem, the scaling exponent, it helps us discard spurious complexity.
  • Make only dimensionless comparisons.
  • A quantity with dimensions is, by itself, meaningless. It acquires meaning only when compared with a relevant quantity that has the same dimensions.
  • Because dimensionless quantities are the only meaningful quantities, we can understand the world better by describing it in terms of dimensionless quantities.
  • Using quantum mechanics, we can predict the properties of atoms in great detail--but the analysis involves complicated mathematics that buries the core ideas. By using dimensional analysis, we can keep the core ideas in sight.
  • Dimensional analysis discards complexity without loss of information.
  • When the going gets tough, the tough lower their standards: approximate first, and worry later.
  • Asking calculators to do simple arithmetic dulls our ability to navigate the quantitative world. The antidote is to do the computations ourselves, but approximately--by placing quantities on a logarithmic scale and rounding them to the nearest convenient value.
  • The simplest method of rounding is to round every number to the nearest power of ten. That simplification turns most calculations into adding and subtracting integer exponents.
  • "Nearest" is judged on a logarithmic scale, where distance is measured not with differences but with ratios or factors.
  • Rounding to the nearest power of ten gives a quick, preliminary estimate. When it is too approximate, we just round more precisely. The next increase in accuracy is to round to the nearest half power of ten.
  • Lumping not only simplifies numbers, where it is called rounding, it also simplifies complex quantities by creating an abstraction: the typical or characteristic value.
  • Using typical or characteristic values allows us to reason out seemingly impossible questions while sitting in our armchairs.
  • Proportional reasoning reduces complexity by showing us a notation for ignoring quantities that do not vary.
  • Lumping rescues us by replacing changing values with a single, constant, typical value--making the relations amenable to proportional reasoning.
  • A powerful form of lumping is to replace complex shapes by a comparably sized cube.
  • In everyday life, an important feature of fluid flow is drag.
  • The essential physical idea is that the viscous force, a force from a neighboring region of fluid, slows down fast pieces of fluid and speeds up slow pieces.
  • Lumping smooths out variation.
  • Lumping replaces a complex, changing process with a simpler, constant process.
  • The antidote to complicated integrals is lumping.
  • By using lumping to introduce quantum mechanics, we will gain a physical intuition for the effect of quantum mechanics.
  • In mechanics, the simplest useful model is motion in a straight line at constant acceleration. In quantum mechanics, the simplest useful model is a particle confined to a box.
  • Lumping is our first tool for discarding complexity with loss of information. By doing so, it simplifies complicated problems where our previous set of tools could not. Curves become straight lines, calculus becomes algebra, and even quantum mechanics becomes comprehensible.
  • Probabilistic reasoning helps us when our information is already incomplete--when we've discarded even the chance or the wish to collect the missing information.
  • The essential concept in using probability to simplify the world is that probability is a degree of belief. Therefore, a probability is based on our knowledge, and it changes when our knowledge changes.
  • The Bayesian interpretation is based on one simple idea: a probability reflects our degree of belief in a hypothesis. Probabilities are therefore subjective: someone with different knowledge will have different probabilities. Thus, by collecting evidence, our degrees of belief change. Evidence changes probabilities.
  • Random walks are everywhere.
  • In large complex systems, the information is either overwhelming or not available. Then we have to reason with incomplete information. The tool for this is probabilistic reasoning--in particular, Bayesian probability.
  • Probabilistic reasoning helps us manage incomplete information.
  • A correct analysis works in all cases--including the simplest. This principle is the basis of our next tool for discarding complexity: the method of easy cases.
  • Easy-cases reasoning is a way of introducing physical knowledge.
  • Special relativity is Einstein's theory of motion. It unifies classical mechanics and classical electrodynamics (the theory of radiation), giving a special role to the speed of light c.
  • The speed of light is the universe's speed limit, and special relativity obeys it.
  • The Heisenberg uncertainty principle restricts how small we can make these uncertainties, and therefore how accurately we can determine the position and momentum [of a particle].
  • Look at the easy cases first. Often, we can completely solve a problem simply by understanding the easy cases.
  • Our final tool for mastering complexity is making spring models. The essential characteristics of an ideal spring, the transferable abstractions, are that it produces a restoring force proportional to the displacement from equilibrium and stores an energy proportional to the displacement squared. These exceedingly specific requirements are met far more widely than we might expect.
  • Sunlight is an oscillating electric field.
  • Many physical processes contain a minimum-energy state where small deviations from the minimum require an energy proportional to the square of the deviation. This behavior is the essential characteristic of a spring. A spring is therefore not only a physical object but a transferable abstraction.
  • For long-lasting learning, the pieces of knowledge should support each other through their connections. For when we remember a fact or use an idea, we activate connected facts and ideas and solidify them in our minds.
  • So, for long-lasting learning and understanding, make bonds: connect each new fact and idea to what you already know. This way of thinking will help you learn in one year what took me two or twenty.
  • Use your reasoning tools to weave a richly connected, durable tapestry of knowledge.

20180216

TALENT IS OVERRATED by Geoff Colvin


  • Extensive research in a wide range of fields shows that many people not only fail to become outstandingly good at what they do, no matter how many years they spend doing it, they frequently don’t even get any better than they were when they started.
  • Deliberate practice is hard. It hurts. But it works. More of it equals better performance. Tons of it equals great performance.
  • Most organizations are terrible at applying the principles of great performance. Many companies seem arranged almost perfectly to prevent people from taking advantage of these principles for themselves or for the teams in which they work.
  • Contemporary athletes are superior not because they’re somehow different but because they train themselves more effectively. That’s an important concept for us to remember.
  • The pressure on us to keep getting better is greater than it used to be because of a historic change in the economy.
  • Companies of all kinds have far more money than they need. The cash held by U.S. companies is hitting all-time records. Companies are using some of this money to buy back their own stock at record rates. When a company does this, it’s saying to its investors: We don’t have any good ideas for what to do with this, so here—maybe you do.
  • For virtually every company, the scarce resource today is human ability.
  • Processing information and moving it around costs practically nothing. For those same reasons, offshoring of manufacturing jobs is also exploding.
  • The costs of being less than truly world class are growing, as are the rewards of being genuinely great.
  • Being good at whatever we want to do—playing the violin, running a race, painting a picture, leading a group of people—is among the deepest sources of fulfillment we will ever know. Most of what we want to do is hard.
  • In fact, the overwhelming impression that comes from examining the early lives of business greats is just the opposite—that they didn’t seem to hold any identifiable gift or give any early indication of what they would become.
  • A wide range of research shows that the correlations between IQ and achievement aren’t nearly as strong as the data on broad averages would suggest, and in many cases there’s no correlation at all.
  • the research tells us that intelligence as we usually think of it—a high IQ—is not a prerequisite to extraordinary achievement.
  • Practice is so hard that doing a lot of it requires people to arrange their lives in particular ways.
  • Yes, more total practice is very powerfully associated with better performance.
  • What the authors called “deliberate practice” makes all the difference.
  • “the differences between expert performers and normal adults reflect a life-long period of deliberate effort to improve performance in a specific domain.”
  • Deliberate practice is characterized by several elements, each worth examining. It is activity designed specifically to improve performance, often with a teacher’s help; it can be repeated a lot; feedback on results is continuously available; it’s highly demanding mentally, whether the activity is purely intellectual, such as chess or business-related activities, or heavily physical, such as sports; and it isn’t much fun.
  • Decades or centuries of study have produced a body of knowledge about how performance is developed and improved, and full-time teachers generally possess that knowledge.
  • deliberate practice requires that one identify certain sharply defined elements of performance that need to be improved, and then work intently on them.
  • The great performers isolate remarkably specific aspects of what they do and focus on just those things until they are improved; then it’s on to the next aspect.
  • Identifying the learning zone, which is not simple, and then forcing oneself to stay continually in it as it changes, which is even harder—these are the first and most important characteristics of deliberate practice.
  • High repetition is the most important difference between deliberate practice of a task and performing the task for real, when it counts.
  • Repeating a specific activity over and over is what most of us mean by practice, yet for most of us it isn’t especially effective.
  • Top performers repeat their practice activities to stultifying extent.
  • More generally, the most effective deliberate practice activities are those that can be repeated at high volume.
  • You can work on technique all you like, but if you can’t see the effects, two things will happen: You won’t get any better, and you’ll stop caring.
  • Deliberate practice is above all an effort of focus and concentration.
  • Doing things we know how to do well is enjoyable, and that’s exactly the opposite of what deliberate practice demands. Instead of doing what we’re good at, we insistently seek out what we’re not good at. Then we identify the painful, difficult activities that will make us better and do those things over and over.
  • The reality that deliberate practice is hard can even be seen as good news. It means that most people won’t do it. So your willingness to do it will distinguish you all the more.
  • Most fundamentally, what we generally do at work is directly opposed to the first principle: It isn’t designed by anyone to make us better at anything.
  • Deliberate practice does not fully explain achievement—real life is too complicated for that.
  • While it has often been observed that those who work the hardest seem to be the luckiest, the fact remains that if a bridge collapses while you’re driving over it, nothing else matters.
  • Measuring the intensity of practice may be difficult, but it’s clearly significant.
  • Frequently when we see great performers doing what they do, it strikes us that they’ve practiced for so long, and done it so many times, they can just do it automatically. But in fact, what they have achieved is the ability to avoid doing it automatically.
  • When we learn to do anything new—how to drive, for example—we go through three stages. The first stage demands a lot of attention as we try out the controls, learn the rules of driving, and so on. In the second stage we begin to coordinate our knowledge, linking movements together and more fluidly combining our actions with our knowledge of the car, the situation, and the rules. In the third stage we drive the car with barely a thought. It’s automatic. And with that our improvement at driving slows dramatically, eventually stopping completely.
  • The essence of practice, which is constantly trying to do the things one cannot do comfortably, makes automatic behavior impossible.
  • Many years of intensive deliberate practice actually change the body and the brain.
  • Top professionals do indeed have very fast reaction times, and reaction speed can be improved with practice, so professionals work on it. The problem is that improvements in reaction speed follow what scientists call a power law (because there’s an exponent in the formula) and what the rest of us call the 80-20 rule. That is, nearly all the improvement comes in the first little bit of training. After that, lots more practice yields only a little additional improvement.
  • Sometimes excellent performers see more by developing better and faster understanding of what they see.
  • More generally in business and other fields, nonobvious indicators may be so valuable that most of us never know about them.
  • In general, regardless of whether indicators are secret, developing and using them requires extensive practice.
  • Most of the indicators used by top performers require practice to be of any use.
  • Getting information pushes at the two constraints everyone faces: It takes time and costs money. Making sound decisions fast and at low cost is a competitive advantage everywhere.
  • Deliberate practice works by helping us acquire the specific abilities we need to excel in a given field.
  • Building and developing knowledge is one of the things that deliberate practice accomplishes.
  • Researchers find that excellent performers in most fields exhibit superior memory of information in their fields.
  • The capacity of short-term memory doesn’t seem to vary much from person to person; virtually everyone’s short-term memory falls in the range of five to nine items.
  • Top performers understand their field at a higher level than average performers do, and thus have a superior structure for remembering information about it.
  • The brain’s ability to change is greatest in youth, but it doesn’t end there.
  • fundamentals of great performance are mainly unrecognized or ignored.
  • Step one, obvious yet deserving a moment’s consideration, is knowing what you want to do. The key word is not what, but knowing.
  • The first challenge in designing a system of deliberate practice is identifying the immediate next steps.
  • The skills and abilities one can choose to develop are infinite, but the opportunities to practice them fall into two general categories: opportunities to practice directly, apart from the actual use of the skill or ability, the way a musician practices a piece before performing it; and opportunities to practice as part of the work itself.
  • Remember, feedback is crucial to effective practice, and people have a tendency to misremember what they thought in the past; we almost always adjust our recollections flatteringly, in light of how events actually turned out. But there’s no escaping a written record.
  • Opportunities to practice business skills directly are far more available than we usually realize, but even these aren’t the only opportunities.
  • Effective self-regulation is something you do before, during, and after the work activity itself.
  • Self-regulation begins with setting goals. These are not big, life-directing goals, but instead are more immediate goals for what you’re going to be doing today.
  • The best performers set goals that are not about the outcome but about the process of reaching the outcome.
  • But within that activity, the best performers are focused on how they can get better at some specific element of the work.
  • An important part of prework self-regulation centers on attitudes and beliefs.
  • The best performers go into their work with a powerful belief in what researchers call their self-efficacy—their ability to perform. They also believe strongly that all their work will pay off for them.
  • The most important self-regulatory skill that top performers use during their work is self-observation.
  • The best performers observe themselves closely. They are in effect able to step outside themselves, monitor what is happening in their own minds, and ask how it’s going. Researchers call this metacognition—knowledge about your own knowledge, thinking about your own thinking. Top performers do this much more systematically than others do; it’s an established part of their routine.
  • Metacognition is important because situations change as they play out.
  • Practice activities are worthless without useful feedback about the results. Similarly, the practice opportunities that we find in work won’t do any good if we don’t evaluate them afterward.
  • Excellent performers judge themselves differently from the way other people do. They’re more specific, just as they are when they set goals and strategies.
  • A critical part of self-evaluation is deciding what caused the errors. Average performers believe their errors were caused by factors outside their control.
  • Top performers, by contrast, believe they are responsible for their errors. Note that this is not just a difference of personality or attitude.
  • It’s crazy that in most jobs and at most organizations, there’s little or no explicit education about the nature of the domain.
  • As you add to your knowledge of your domain, keep in mind that your objective is not just to amass information. You are building a mental model—a picture of how your domain functions as a system. This is one of the defining traits of great performers: They all possess large, highly developed, intricate mental models of their domains.
  • A mental model forms the framework on which you hang your growing knowledge of your domain.
  • A mental model not only enables remarkable recall, it also helps top performers learn and understand new information better than average performers, since they see it not as an isolated bit of data but as part of a large and comprehensible picture.
  • A mental model helps you distinguish relevant information from irrelevant information.
  • Most important, a mental model enables you to project what will happen next.
  • Since your mental model is an understanding of how your domain functions as a system, you know how changes in the system’s inputs will affect the outputs—that is, how the events that just happened will create the events that are about to happen.
  • A mental model is never finished. Great performers not only possess highly developed mental models, they are also always expanding and revising those models. It isn’t possible to do the whole job through study alone.
  • No matter how many steps on the road to great performance you choose to take, you will be better off than if you hadn’t taken them.
  • Few do it well, and most don’t do it at all; the sooner you start, the better.
  • Not all organizations want to be great. That’s the hard truth. For those that do—that really do—the principles of great performance show quite clearly what it takes to get there.
  • Understand that each person in the organization is not just doing a job, but is also being stretched and grown.
  • Organizations tend to assign people based on what they’re already good at, not what they need to work on.
  • Deliberately putting managers into stretch jobs that will require them to learn and grow is the central development technique of the most successful organizations.
  • Executives consistently report that their hardest experiences, the stretches that most challenged them, were the most helpful.
  • Find ways to develop leaders within their jobs.
  • We’ve seen that great performance is built through activities that are designed specifically to improve particular skills, and that in many realms teachers and coaches are especially helpful in designing those activities.
  • Most organizations are terrible at providing honest feedback.
  • Identify promising performers early.
  • Understand that people development works best through inspiration, not authority.
  • Deliberate practice activities are so demanding that no one can sustain them for long without strong motivation.
  • Invest significant time, money, and energy in developing people.
  • Make leadership development part of the culture.
  • Developing leaders isn’t a program, it’s a way of living.
  • Develop teams, not just individuals.
  • Organizations that are the most successful at building team performance are especially skilled at avoiding or addressing potential problems that are particularly toxic to the elements of deliberate practice.
  • Trust is the most fundamental element of a winning team.
  • Just as great individual performers possess highly developed mental models of their domains, the best teams are composed of members who share a mental model—of the domain, and of how the team will be effective.
  • Applying the principles of great performance in an organization is no easier than doing anything else in an organization. It’s hard.
  • The effects of deliberate practice activities are cumulative.
  • In the digital age, any products that can be compared will be compared, and any directly comparable products will be commoditized.
  • A product unlike any other can’t be commoditized. A service that reaches deep into the psyche of the buyer can never be purchased solely on price. Creating such products and services was always valuable; now it’s essential.
  • As products and services live shorter lives, so do the business models of the companies that sell them.
  • Creativity and innovation have always been important; what’s new is that they’re becoming economically more valuable by the day.
  • The evidence underlying the principles of deliberate practice and great performance shows that in finding creative solutions to problems, knowledge—the more the better—is your friend, not your enemy. And it shows that creativity isn’t a lightning bolt.
  • The greatest innovators in a wide range of fields—business, science, painting, music—all have at least one characteristic in common: They spent many years in intensive preparation before making any kind of creative breakthrough. Creative achievement never came suddenly, even in those cases in which the creator later claimed that it did.
  • Great innovations are roses that bloom after long and careful cultivation.
  • As for what exactly is going on during those long periods of preparation, it looks a lot like the acquisition of domain knowledge that takes place during deliberate practice. It is certainly intensive and deep immersion in the domain, frequently under the direction of a teacher, but even when not, the innovator seems driven to learn as much as possible about the domain, to improve, to drive himself or herself beyond personal limits and eventually beyond the limits of the field.
  • The most eminent creators are consistently those who have immersed themselves utterly in their chosen field, have devoted their lives to it, amassed tremendous knowledge of it, and continually pushed themselves to the front of it.
  • “The idea of epiphany is a dreamer’s paradise where people want to believe that things are easier than they are.”
  • One of the main reasons why the people in organizations don’t produce more innovation is that the culture isn’t friendly to it. New ideas aren’t really welcomed. Risk taking isn’t embraced.
  • Culture change starts at the top.
  • People who are internally driven to create do seem more creative than those who are just doing it for the money.
  • The heavy burden of the evidence is that creativity is much more available to us than we tend to think.
  • No one becomes extraordinary on his or her own, and a striking feature in the lives of great performers is the valuable support they received at critical times in their development.
  • Cultures encourage or discourage specific pursuits at different times.
  • The greatest value of a supporting home environment is that it enables a person to start developing early.
  • Starting early holds advantages that become less available later in life.
  • As we have seen repeatedly, becoming world-class great at anything seems to require thousands of hours of focused, deliberate practice.
  • Most organizations are not intellectually stimulating, even when the field itself might seem fascinating; rather than offering opportunities to learn and rewarding curiosity, the typical organization leaves inquisitive employees to find their own ways to learn.
  • The fundamental reason why there are no teenage prodigies in certain domains is that it’s impossible to accumulate enough development time by the teenage years.
  • One of the best established and least surprising findings in psychology is that as we age, we slow down.
  • Somehow, excellent performers manage to continue achieving at high levels well beyond the point where age-related declines would seem to make that impossible.
  • Studies in a very broad range of domains—management, aircraft piloting, music, bridge, and others—show consistently that excellent performers suffer the same age-related declines in speed and general cognitive abilities as everyone else—except in their field of expertise.
  • In general, well-designed practice, pursued for enough time, enables a person to circumvent the limitations that would otherwise hold back his or her performance, and circumventing limitations is the key to high performance at an advanced age.
  • Our brains are perfectly able to add new neurons well into old age when conditions demand it, and brain plasticity doesn’t stop with age.
  • Most people stop the deliberate practice necessary to sustain their performance.
  • Eventually, of course, everyone’s performance declines.
  • As people master tasks, they must seek greater challenges and match them with higher-level skills in order to keep experiencing flow.
  • As for rewards, at most companies they almost always entail more responsibilities and less freedom. Extra responsibilities are always part of rising higher in an organization, but if they don’t come with the potential for more self-direction, the promotion will feel more like a burden than a reward.
  • The weight of the evidence is that the drive to persist in the difficult job of improving, especially in adults, comes mostly from inside.
  • In domains where building the knowledge foundation takes many years before specific domain-related work can begin, such as business and high-level science, we commonly see that future stars may be decidedly undriven even as young adults.
  • A very small advantage in some field can spark a series of events that produce far larger advantages.
  • A similar way to ignite the multiplier effect is to begin learning skills in a place where competition is sparse.
  • Becoming a great performer demands the largest investment you will ever make—many years of your life devoted utterly to your goal—and only someone who wants to reach that goal with extraordinary power can make it.
  • Everyone who has achieved exceptional performance has encountered terrible difficulties along the way. There are no exceptions.

20180215

The Elon Musk Post Series by Wait But Why




  • Like you often read in the bios of extraordinary people, he [Elon Musk] was an avid self-learner early on.
  • His brother Kimbal has said Elon would often read for 10 hours a day.
  • One thing you'll learn about Musk as you read these posts is that he thinks of humans as computers, which, in their most literal sense, they are.
  • A human's hardware is his physical body and brain. His software is the way he thinks, his value system, his habits, his personality.
  • Learning, for Musk, is simply the process of "downloading data and algorithms into your brain".
  • There are a few people in each generation who dramatically change the world, and those people are worth studying. They do things differently from everyone else--and I think there's a lot to learn from them.
  • Musk is a smart motherfucker, and he knows a ton about AI, and his sincere concern about this makes me scared.
  • I've heard people compare knowledge of a topic to a tree. If you don't fully get it, it's like a tree in your head with no trunk--and without a trunk, when you learn something new about the topic--a new branch or leaf of the tree--there's nothing for it to hang onto, so it just falls away. By clearing out fog all the way to the bottom, I build a tree trunk in my head, and from then on, all new information can hold on, which makes that topic forever more interesting and productive to learn about.
  • Let's call energy "the thing that lets something do stuff".
  • But the tricky thing about energy is the law of conservation of energy, which says that energy can't be created or destroyed, only transferred or transformed from one form to another. And since every living thing needs energy in order to do stuff--and you can't make your own energy--we're all awkwardly left with no choice but to steal the energy we need from someone else.
  • Almost all of the energy used by the Earth's living things got to us in the first place from the sun.
  • Plants know how to take the sun's joules [energy] and turn them into food. At that point, all hell breaks loose as everyone starts murdering everyone else so they can steal their joules.
  • We use "the food chain" as a cute euphemism for this murder/theft cycle [of energy], and we use the word "eating" to refer to "stealing someone else's joules and also murdering them too".
  • There are joules floating and swirling and zooming all around us, and by inventing the concept of technology, humans figured out ways to get use out of them.
  • The most exciting joule-stealing technology humans came up with was figuring out how to burn something.
  • Fire was a hectic dragon and no one had figured out how to grab its reins. And then came the breakthrough. Steam.
  • When you see power lines on the street, all they're doing is delivering joules of a far-away fire to people's homes.
  • Fossil fuels are called fossil fuels because they're the remains of ancient living things.
  • The largest portion of our fossil fuels comes from plants, animals, and algae that lived during the Carboniferous Period--a 50 million year period that ended about 300 million years ago and during which there were lots of huge, shallow swamps. The swamps were important because they made it more likely that a dead organism would be preserved.
  • The US is to coal as Saudi Arabia is to oil, possessing 22% of the world's coal, the most of any nation. China, though, has become by far the world's largest consumer of coal--over half of the coal burned in the world in recent years was burned in China.
  • The United States is by far the biggest consumer of oil in the world, consuming over 20% of the world's oil and about double the next biggest consumer.
  • The US is also one of the three biggest oil producers in the world, alongside Saudi Arabia and Russia, who all produce roughly the same amount.
  • Climate change is a thing.
  • Fact: Burning fossil fuels makes atmospheric CO2 levels rise.
  • Combustion is reverse photosynthesis.
  • When a plant grows, it makes its own food through photosynthesis. At its most oversimplified, during photosynthesis, the plant takes CO2 from the air and absorbs light energy from the sun to split the CO2 into carbon (C) and oxygen (O2). The plant keeps the carbon and emits the oxygen as a waste product. The sun's light energy stays in the plant as chemical energy the plant can use.
  • Photosynthesis just kidnaps carbon and sun energy out of the atmosphere, and after years of holding them hostage, combustion sets them both free--the carbon as a billowing eruption of newly reunited CO2, and the sun energy as fire--meaning that fire is essentially just tightly packed sunshine.
  • Carbon flows from the atmosphere into plants and animals, into the ground and water, and then back out of all those things into the atmosphere--that's called the carbon cycle. At any given point in time, the Earth's active carbon cycle contains a specific amount of carbon. Burning a log doesn't change that level because the carbon cycle "expects" that carbon to be hanging around the ground, water, or air.
  • Fact: CO2 levels are rising quickly.
  • Fact: Where atmospheric CO2 levels go, temperatures follow.
  • CO2 is a greenhouse gas.
  • The way an actual greenhouse works is the glass lets in sun energy and traps a lot of it inside as heat. There are a handful of chemicals in our atmosphere that do the same thing--sun rays come in, bounce off the Earth, and they're on their way out when the greenhouse gases in the atmosphere block some of them and spread them through the atmosphere, warming things up.
  • Fact: The temperature doesn't need to change very much to make everything shitty.
  • You don't need the average temperature to go up by a catastrophic amount to have a catastrophe--because the average temp could go up by only 3 degrees C, but the max temp rises by a lot more.
  • Just one day at an outlier high like 58 degrees C (136 F) would wipe out most of the Earth's crops and animals.
  • Because water vapor is a greenhouse gas, heavy evaporation causes a runaway greenhouse effect starting at this temperature (70 C).
  • Burning fossil fuels makes everything shitty.
  • If we continue to burn fossil fuels as much as we are, things might get really shitty kind of soon.
  • Fossil fuels are endful.
  • The problem with running out [of fossil fuels], whenever it happens, is that if the world is anywhere near as reliant on fossil fuels at that point as we are now, it'll cause an epic economic collapse.
  • At some point in the future, either really soon or just a little soon, we'll have no choice but to stop running everything on fossil fuels, because they'll either be gone or too expensive.
  • There's potentially huge long-term downside to staying in the [fossil fuel era] for too long, so let's just get ourselves to the [sustainable energy era] as soon as we can.
  • 94% of the world's transportation runs on oil, and in most developed countries, the percentage is even higher.
  • China is an energy monster, mostly because they're an industrial monster. They're also a coal-burning beast, burning through almost half of the world's total coal consumption each year.
  • The US has become a natural gas consumption beast and by far the biggest one in the world.
  • Electricity production is huge and mostly dirty.
  • Transportation is huge and almost entirely dirty.
  • With a steam engine, the fire burns outside the engine and heats steam inside the engine to make it work. So it's an external combustion engine. An internal combustion engine cuts out the steam and burns the fuel inside the engine itself to generate power.
  • "Ideal" wasn't the driving force of the early auto industry--scalable was.
  • "Hot explosions in cylinders pushing pistons back and forth to force metal bars to turn wheels and sending the resulting smoke billowing out the pipe" sounds like an old-fashioned technology, and it's just very odd that we're still using it today.
  • The problem with the question "Why did X technology stop moving forward?" is that it's misunderstanding how progress works. Instead of asking why technological progress sometimes stops, we have to ask the question: Why does technological progress ever happen at all?
  • Natural selection doesn't make things "better"--it just optimizes biology to best survive in whatever environmental circumstances it finds itself.
  • In order for government funding to lead to major progress, there has to be a lot of it, and in an open democracy, that only flies when the nation needs to do something so important that everyone agrees on it.
  • The way capitalism theoretically works is that the more real-world value you create, the more money you'll make.
  • Greed can lead to steady forward progress, but in order for progress to leap forward, a second ingredient is usually key: a burning desire to do something great.
  • Greedy companies will make their decisions based on whatever the best way is to optimize to their environment--i.e. How can we make the most money?
  • Greed is a simple motivation--it takes whatever it can get, and it'll push all available limits it can in order to fully optimize.
  • In the auto industry, CO2 emissions are a negative externality. If you have a cheap and easy way to build cars that dump garbage into the atmosphere and no one makes you pay for it, why would you ever change anything?
  • The problem is, giant companies have enough influence that any government attempt at making changes through regulation ends up being watered down to the point where it's ineffective.
  • There are two types of electric motors--the AC induction motor and the brushless DC electric motor.
  • A gas engine is a lot more complicated than an electric motor, with over 200 moving parts; an electric motor has fewer than 10.
  • Electric motors are more convenient than gas engines most of the time.
  • It costs a lot less to power an electric motor than a gas engine.
  • The gas engine is one of the two major causes of the energy/climate crisis.
  • The electric motor is clearly the easier, cheaper, and more sustainable long-term plan for powering cars.
  • Tesla business plan:
    • 1) High-priced low-volume car for the super rich.
    • 2) Mid-priced mid-volume car for the pretty rich.
    • 3) Low-priced high-volume car for the masses.
  • The CEO receives the distillation of all the worst problems in the company, only spending time on the things that are going wrong, and you get all the stuff other people can't take care of, so you have a filter for the crappiest problems in the company.
  • The product is successful when it's great, and the company becomes great because of that.
  • The moment the person leading the company thinks numbers have value in themselves, the company's done. The moment the CFO becomes the CEO--it's done.
  • Over time, big industries tend to get flabby and uncreative and risk-averse--and if the right outsider company has the means and creativity to come at the industry with a fresh perspective and rethink the whole thing, there's often a huge opportunity there.
  • It's a rule of thumb in the car world that every $5,000 decrease in car price approximately doubles the number of buyers who can afford the car.
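The rule of thumb above implies an exponential relationship between price and market size: every $5,000 cut doubles the pool of buyers. A toy calculation sketching that (the specific car prices below are illustrative, not from the source):

```python
def market_multiplier(old_price, new_price, halving_step=5_000):
    """Apply the rule of thumb: each $5,000 price drop roughly
    doubles the number of buyers who can afford the car."""
    return 2 ** ((old_price - new_price) / halving_step)

# Hypothetical: moving from a ~$100k luxury price point
# to a ~$35k mass-market price point is 13 halving steps,
# i.e. roughly 2**13 = 8192x more potential buyers.
print(market_multiplier(100_000, 35_000))
```

This is only a back-of-the-envelope heuristic, but it shows why step 3 of the Tesla plan (the low-priced, high-volume car) dwarfs the earlier steps in addressable market.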
  • Big problems call for big solutions.
  • Dealerships make a huge amount of profit fixing gas engines, oil filters, and doing oil changes--money they'd stop making when they sold EVs with motors that rarely broke.
  • The issue with Tesla right now is most people can't afford one, and the issue with every other EV is the range sucks.
  • The truth is, the typical American drives 37 miles a day on average, and the 80+ mile range options are probably actually plenty for most people. But 80 miles seems like an insufficient range to prospective buyers, and mass adoption won't happen with that kind of range.
  • The gas era is over and EVs are the obvious, obvious future.
  • Unlike car companies, the oil industry can't suck it up, get on the EV train, and after an unpleasant hump, continue to thrive. If EVs catch on in a serious way and end up being the ubiquitous type of car, oil companies are ruined.
  • Giant industries don't just roll over and eat razor blades without a serious fight.
  • The tactic to stay alive longer is always the same--put out misinformation to create confusion, and make it political so half the country feels like they're going against "their own team" if they side against the industry.
  • The super-clever way they create confusion is by generating the public perception that there's a genuine debate among scientists.
  • Tesla's mission is "to accelerate the advent of sustainable transport by bringing compelling mass market electric cars to market as soon as possible". Big Oil's current mission is "to delay the advent of sustainable transport by making people think EVs aren't actually better for the environment than gas cars".
  • The thing you'll notice, though, is that every time you hear someone all mad about the "long tailpipe" emissions of EVs, they're using words like, "may be" and "often" and "in some cases, some studies show". That's because you have to use words like that when you're saying things that you wish were true but actually aren't.
  • Energy production is more efficient in a power plant than it is in a car engine.
  • In a car, burning gas is less than 25% efficient, with the vast majority of the energy lost to heat.
  • The average new gas car gets 23 MPG. Anything above 30 MPG is really good for a gas car, and anything below 17 is bad.
  • Because the grid is getting cleaner every year, it means an EV gets cleaner as time goes by.
  • Driving a gas car is like littering on a camping trip, smoking on an airplane, and throwing a big stack of paper in the trash, and it's just a matter of time until public disgust catches up to it.
  • Our intuition tells us that technology, social norms, movements and ideas just move forward through time, as if forward progress is a river and those things are on a raft gliding through. We so associate the passage of time with progress that we use the term "the future" to refer to a better, more advanced version of our present world.
  • In reality, if a more advanced future does happen, it's because that future was willed into our lives by a few brave people. The present isn't welcoming of an advanced future because the present is run by a thick canopy made up of ideas, norms, and technologies of the past.
  • Our modern world became as advanced as it is not by floating up an inevitable advancement river, but because of a collection of moments over time when a person or company has done something that makes everyone's jaw drop.
  • Space travel is unbelievably expensive.
  • The first and primary reason humans have interacted with space since the Apollo program isn't about human interest in space. It's about using space for practical purposes in support of industries on Earth--mostly in the form of satellites.
  • Low Earth Orbit (LEO) starts at 99 miles (160 km) above the Earth, the lowest altitude at which an object can orbit without atmospheric drag messing things up. The top of LEO is 1,240 miles (2,000 km) up.
  • Geostationary orbit (GEO) is right at 22,236 miles (35,786 km) above the Earth, and it's called geostationary because something orbiting in it rotates at the exact speed that the Earth turns, making its position in the sky stationary relative to a point on the Earth. It'll seem to be motionless to an observer on the ground.
  • Medium-Earth Orbit (MEO) is everything in between LEO and GEO.
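The 22,236-mile GEO figure quoted above isn't arbitrary: it falls out of Kepler's third law when you ask for a circular orbit whose period equals Earth's sidereal rotation period. A minimal sketch, using standard published values for Earth's gravitational parameter and sidereal day:

```python
import math

MU_EARTH = 3.986004418e14   # Earth's gravitational parameter G*M, m^3/s^2
SIDEREAL_DAY = 86164.0905   # one rotation relative to the stars, seconds
EARTH_RADIUS_KM = 6378.1    # mean equatorial radius, km

def geostationary_altitude_km():
    """Solve Kepler's third law, T^2 = 4*pi^2*r^3/mu, for the orbital
    radius r, then subtract Earth's radius to get altitude."""
    r_m = (MU_EARTH * SIDEREAL_DAY ** 2 / (4 * math.pi ** 2)) ** (1 / 3)
    return r_m / 1000 - EARTH_RADIUS_KM

print(round(geostationary_altitude_km()))  # ~35786 km, matching the figure above
```

A satellite at exactly this altitude completes one orbit per Earth rotation, which is why it appears fixed in the sky.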
  • At the incredible speeds at which space objects move, a collision with even a tiny object can cause devastating damage to an active satellite or spacecraft.
  • The official definition of a planet is:
    • 1) Has to orbit the sun.
    • 2) Has to be big enough to become spherical-ish under its own gravity.
    • 3) Has to have cleared out its orbit.
  • An AU is an "astronomical unit"--the distance from the Earth to the sun--which is about 93 million miles (150 million km).
  • I'm not sure people realize that there are huge, almost planet-size objects in the asteroid belt.
  • The only reason any humans have gone to space since Apollo 17 returned to Earth in 1972 is that sometimes, the machines aren't yet advanced enough to do a certain task, so we need to send a human up to do it instead.
  • A good magic show follows a simple rule--make the act get better as it goes along. If you can't continue to stay a step ahead of the increasingly-jaded crowd, they'll quickly tune you out.
  • Like the rest of us, Elon Musk has a handful of life goals. Unlike the rest of us, one of those life goals is to put 1,000,000 people on Mars.
  • Species extinctions are kind of like human deaths--they're happening constantly, at a mild and steady rate. But a mass extinction event is, for species, like a war or a sweeping epidemic is for humans--an unusual event that kills off a large chunk of the population in one blow.
  • Humans have never experienced a mass extinction event, and if one happened, there's a reasonable chance it would end the human race--either because the event itself would kill us, or the effects of an event would.
  • The universe is a violent, hostile place and we're a group of fragile organisms living in a delicate balance of precise conditions. We're around, for now, because the universe is currently allowing us to be.
  • Supernovae, the universe's largest explosions, happen when giant stars die.
  • Gamma-ray bursts are the universe's brightest events.
  • Why a million people [on Mars]? Because that's Musk's rough estimate for the minimum number of people it would take to create a completely self-sustaining population.
  • This concept--making human life multi-planetary in a self-sustaining way--is often called "planetary redundancy".
  • Mars is basically a colder Antarctica that looks like the Arizona desert with air you can't breathe and a sun that radiates you to death if you're exposed to it for long. Every part of Mars is dramatically less livable than the least livable place on Earth. But the conditions are reasonable enough that with a man-made "hab" to live in, a little greenhouse garden, and a good enough spacesuit, you could actually exist on Mars without dying.
  • In theory, with enough effort and technology, humans could terraform Mars and sometime down the road have a somewhat pleasant planet to live on, with trees and oceans and no need to wear a spacesuit outside.
  • An AU is the distance from the sun to the Earth.
  • Backing up the humanity hard drive is a critical and necessary thing to do at some point--by having all of our eggs in one planet basket, we're leaving ourselves vulnerable to extinction.
  • Mars is by far the best place to back up the humanity hard drive. But with enough technology, we could create many more backups by colonizing as many as ten or more moons, asteroids, and planets in the Solar System.
  • The sun is about halfway done with its life.
  • The further you zoom out, the "bigger" a turn of events has to be in order to remain significant on that scale.
  • Life history moves far more slowly than human history.
  • The rate of progress can grow exponentially, because as more progress happens, it enables faster progress to happen, and this starts a cascading chain as progress explodes upwards.
  • When a species becomes so powerful that they can achieve giant grand-scale life leaps in under a century, they can essentially play god, in many different ways.
  • With an average of one mass extinction event every 100 million years since animals have been around, we may be currently engineering a sixth one by accident.
  • Sooner or later something will get us if we stay on one planet.
  • What we have is a way to go to Mars for an astronomical amount of money. And that's no way to colonize Mars. To Musk, what's missing is a way to go to Mars affordably.
  • You pay for it [rocket development] by making your research and development operation double as a profitable space delivery service.
  • All a company is is a bunch of people together to create a product or service. There's no such thing as a business, just pursuit of a goal--a group of people pursuing a goal.
  • No assholes.
  • Musk says that if you hate your colleagues or boss, you won't want to come to work and stay for long hours.
  • Hire (and promote) based on raw talent, not experience.
  • Most of today's companies avoid taking on the massive scope vertical integration requires, but for a quality control freak, like Musk or Jobs, it's the only way they'd have it.
  • Almost every person I talked to at both Tesla and SpaceX emphasized how much of an expert Musk is at their particular field, whether that field be car batteries, car design, electric motors, rocket structures, rocket engines, rocket electronics ("avionics"), or aerospace engineering. He can do this because of a combination of his immensely thick tree trunk of fundamental understanding of physics and engineering and his genius-level ability to retain information as he learns it.
  • It's that insane breadth of expertise that allows Musk to maintain such an abnormally high level of control over everything that happens at his companies.
  • The rocket is the main big thing that launches, and it has one job: to carry the payload and its container up through the atmosphere and put it into space. Most of the rocket is a big fuel tank, and at the bottom of a rocket is one or more insanely powerful, bell-shaped engines.
  • The only companies in aerospace are huge, and huge companies are risk averse.
  • Musk also believes the vertical [integration] structure is critical to keeping costs down.
  • The commonly-recognized altitude where "space" starts is the Kármán line, 62 mi (100 km) up.
  • Orbital velocity just means the arc the path makes is broader than the curvature of the planet, so you keep missing the ground.
  • People think a rocket launch goes up, but really it's throwing something really hard sideways.
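The "throwing something sideways" picture above can be checked with the standard circular-orbit formula, v = sqrt(mu/r). A minimal sketch, using textbook reference constants that are not from these notes:

```python
import math

# Earth's gravitational parameter (mu = G*M) and mean radius.
# These are standard reference values, not figures from the notes.
MU_EARTH = 3.986e14   # m^3/s^2
R_EARTH = 6.371e6     # m

def orbital_speed(altitude_m: float) -> float:
    """Speed needed for a circular orbit at the given altitude (m/s)."""
    return math.sqrt(MU_EARTH / (R_EARTH + altitude_m))

# At a 400 km low-Earth orbit (roughly ISS altitude), the required
# "sideways" speed comes out to about 7.7 km/s.
v_leo = orbital_speed(400e3)
```

This is why orbit is about speed, not height: the hard part is not getting 400 km up, it's getting to ~7.7 km/s sideways once you're there.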
  • A good way to think about tons is that a car is about two tons.
  • You can get everything 99.9% right, and the last 0.1% will explode the rocket in a catastrophic failure. Space is hard.
  • Here's what SpaceX does: It takes things to space for people, for money.
  • Learning about rockets will make you respect the shit out of rockets.
  • Rockets have to burn fuel. The reason is Newton's Third Law: Every action has an equal and opposite reaction.
  • They [rockets] don't need anything external, like air, to push down on--by expelling a mass of hot gas, they're essentially pushing down on that.
  • An important statistic in the rocket engine world is thrust-to-weight ratio--i.e. how many times its own weight it can lift.
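The two engine ideas above--thrust as momentum exchange, and thrust-to-weight ratio--can be sketched in a few lines. The engine figures below are illustrative assumptions (roughly in the range published for SpaceX's Merlin 1D), not numbers taken from the notes:

```python
G0 = 9.81  # standard gravity, m/s^2

def thrust(mdot_kg_s: float, v_exhaust_m_s: float) -> float:
    """Thrust from momentum exchange: F = (exhaust mass flow) * (exhaust speed)."""
    return mdot_kg_s * v_exhaust_m_s

def thrust_to_weight(thrust_n: float, engine_mass_kg: float) -> float:
    """How many times its own weight the engine can lift."""
    return thrust_n / (engine_mass_kg * G0)

# Illustrative numbers: ~300 kg/s of exhaust at ~3,000 m/s gives ~900 kN.
f_n = thrust(300.0, 3000.0)

# With an assumed engine dry mass of ~470 kg, the engine can lift
# roughly 200 times its own weight.
ratio = thrust_to_weight(f_n, 470.0)
```

The ratio is what makes rocket engines remarkable: a car engine lifts a small fraction of its own weight, while a modern rocket engine lifts itself well over a hundred times over.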
  • Sending a spacecraft out to space is hard--but bringing it back might be even harder.
  • The spacecraft's blistering speed in the upper atmosphere means the air in front of an incoming capsule doesn't have time to "get out of the way" and becomes super-compressed and blazingly hot.
  • The good news about failures is they show you exactly where your weak points are.
  • Falcon Heavy is a Falcon 9 except instead of having one first stage, it has three.
  • SpaceX wants to figure out how to make satellites for a much lower cost, and combined with their ability to launch much more cheaply, they'll be able to put cutting-edge satellites in orbit, do so frequently, and replace them just as frequently.
  • Musk plans to have SpaceX-manufactured internet satellites circling Mars down the road, bringing fast internet to the future Mars colony.
  • Propulsive landing means coming down the same way rockets take off.
  • Propulsive landing allows you to land a rocket or spacecraft neatly on a landing platform, with minimal damage to the vehicle.
  • Today, no one is talking about Mars, and very few people think of Mars as a relevant part of the near future. But unless I've missed something big or something unexpected happens, in about 10-20 years, people will start going to Mars. You could go to Mars in your lifetime. Crazy things are on the horizon.
  • There will always, always be important problems to address on Earth, but if we allow what's urgent here to prevent us from addressing what's important in the big picture, we're allowing ourselves to take a huge existential risk.
  • The story of humans and space is ultimately indistinguishable from the story of humans.
  • Terraforming a planet means changing its conditions to match Earth's.
  • When it comes to most of the way we think, the way we make decisions, and the way we live our lives, we're much more like the flood geologists than the science geologists.
  • Musk-Speak is a language that describes everyday parts of life as exactly what they actually, literally are.
  • It's not that Musk suggests that people are just computers--it's that he sees people as computers on top of whatever else they are.
  • At its simplest definition, a computer is an object that can store and process data--which the brain certainly is.
  • Thinking of the brain as a computer forces us to consider the distinction between our hardware and our software, a distinction we often fail to recognize.
  • A computer's software is defined as "the programs and other operating information used by a computer". For a human, that's what they know and how they think--their belief system, thought patterns, and reasoning methods.
  • What makes Musk's software so effective isn't its structure, it's that he uses it like a scientist.
  • Science is a way of thinking much more than it is a body of knowledge.
  • I think generally people's thinking process is too bound by convention or analogy to prior experiences. It's rare that people try to think of something on a first principles basis.
  • You have to build up the reasoning from the ground up--"from the first principles" is the phrase that's used in physics. You look at the fundamentals and construct your reasoning from that, and then you see if you have a conclusion that works or doesn't work, and it may or may not be different from what people have done in the past.
  • A scientist gathers together only what he or she knows to be true--the first principles--and uses those as the puzzle pieces with which to construct a conclusion.
  • There are no axioms or proofs in science because nothing is for sure and everything we feel sure about might be disproven.
  • Theories are based on hard evidence and treated as truths, but at all times they're susceptible to being adjusted or disproven as new data emerges.
  • Hypotheses are built to be tested. Testing a hypothesis can disprove it or strengthen it, and if it passes enough tests, it can be upgraded to a theory.
  • One thing goal attainment often requires is laser focus.
  • Progress is fundamentally governed by the progress of engineering.
  • Musk's stated philosophy is, "When something is important enough, you do it even if the odds are not in your favor".
  • Musk sees people as computers, and he sees his brain software as the most important product he owns--and since there aren't companies out there designing brain software, he designed his own, beta tests it every day, and makes constant updates.
  • Your entire life runs on the software in your head--why wouldn't you obsess over optimizing it?
  • Not only do most of us not obsess over our own software--most of us don't even understand our own software, how it works, or why it works that way.
  • "Because I said so" inserts a concrete floor into the child's deconstruction effort below which no further Why's may pass.
  • A command or lesson or a word of wisdom that comes without any insight into the steps of logic it was built upon is feeding a kid a fish instead of teaching them to reason.
  • Creative thinking is a close cousin of first principles reasoning. In both cases, the thinker needs to invent his own thought pathways.
  • People think of creativity as a natural born talent, but it's actually much more a way of thinking--it's the thinking version of painting onto a blank canvas.
  • Dogma is everywhere and comes in a thousand different varieties--but the format is generally the same: X is true because [authority] says so. The authority can be many things.
  • Only strong reasoning skills can carve a unique life path, and without them, dogma will quickly have you living someone else's life.
  • What most dogmatic thinking tends to boil down to is another good Seth Godin phrase: People like us do stuff like this.
  • A tribe is just a group of people linked together by something they have in common--a religion, an ethnicity, a nationality, family, a philosophy, a cause.
  • Tribalism is good when the tribe and the tribe member both have an independent identity and they happen to be the same.
  • Tribalism is bad when the tribe and tribe member's identity are one and the same.
  • A major part of the appeal of being in a tribe is that you get to be part of an Us, something humans are wired to seek out.
  • Nothing unites Us like a collectively hated anti-Us, and the blind tribe is usually defined almost as much by hating the dogma of Them as it is by abiding by the dogma of Us.
  • Most of the major divides in our world emerge from blind tribalism, and on the extreme end of the spectrum--where people are complete sheep--blind tribalism can lead to terrifying things.
  • The difference between the way Elon thinks and the way most people think is kind of like the difference between a cook and a chef.
  • The chef reasons from first principles, and for the chef, the first principles are raw edible ingredients. The cook works off of some version of what's already out there--a recipe of some kind.
  • The chef creates while the cook, in some form or another, copies. And the difference in outcome is enormous.
  • In order to form an immaculate member of a flock of sheep one must, above all, be a sheep.
  • What often feels like independent reasoning when zoomed out is actually playing connect-the-dots on a pre-printed set of steps laid out by someone else. What feels like personal principles might just be the general tenets of your tribe.
  • Musk is an impressive chef for sure, but what makes him such an extreme standout isn't that he's impressive--it's that most of us aren't chefs at all.
  • By not seeing our thinking software for what it is--a critical life skill, something that can be learned, practiced, and improved, and the major factor that separates the people who do great things from those who don't--we fail to realize where the game of life is really being played.
  • Conventional wisdom is slow to move, and there's significant lag time between when something becomes reality and when conventional wisdom is revised to reflect that reality. And by the time it does, reality has moved on to something else.
  • By ignoring conventional wisdom in favor of simply looking at the present for what it really is and staying up-to-date with the facts of the world as they change in real-time--in spite of what conventional wisdom has to say--the chef can act on information the rest of us haven't been given to act on yet.
  • People believe thinking outside the box takes intelligence and creativity, but it's mostly about independence. When you simply ignore the box and build your reasoning from scratch, whether you're brilliant or not, you end up with a unique conclusion--one that may or may not fall within the box.
  • Simply by refraining from reasoning by analogy, the chef opens up the possibility of making a huge splash with every project.
  • Anytime there's a curious phenomenon within humanity--some collective insanity we're all suffering from--it usually ends up being evolution's fault.
  • When it comes to reasoning, we're biologically inclined to be cooks, not chefs, which relates back to our tribal evolutionary past.
  • Thinking like cooks is what we're born to do because what we're born to do is survive. But the weird thing is, we weren't born into a normal human world. We're living in the anomaly, when for many of the world's people, survival is easy.
  • The problem is, most of our heads are still running on some version of the 50,000-year-old survival software--which kind of wastes the good luck we have to be born now.
  • I think there are three major epiphanies we need to absorb--three core things the chef knows that the cook doesn't:
    • 1) You don't know shit.
    • 2) No one else knows shit either.
    • 3) You're playing Grand Theft Life.
  • The greatest enemy of knowledge is not ignorance, it is the illusion of knowledge.
  • The reason these outrageously smart people are so humble about what they know is that as scientists, they're aware that unjustified certainty is the bane of understanding and the death of effective reasoning.
  • If you were alone in a room with a car and wanted to figure out how it worked, you'd probably start by taking it apart as much as you could and examining the parts and how they all fit together. To do the same thing with our thinking, we need to revert to our four-year-old selves and start deconstructing our software by resuming the Why game our parents and teachers shut down decades ago.
  • The thing you really want to look closely for is unjustified certainty. When there's proof-level certainty, it means either there's some serious concrete and verified data underneath it--or it's faith-based dogma.
  • Everything around you that you call life was made up by people that were no smarter than you. And you can change it, you can influence it, you can build your own things that other people can use. Once you learn that, you'll never be the same again.
  • Being a trailblazer is just not respecting the beaten path and so deciding to blaze yourself a new one. Being a ground-breaker is just knowing that the ground wasn't laid by anyone that impressive and so feeling no need to keep it intact.
  • Not respecting society is totally counter intuitive to what we're taught when we grow up--but it makes perfect sense if you just look at what your eyes and experience tell you.
  • There are clues all around showing us that conventional wisdom doesn't know shit.
  • Sometimes it takes an actual experience to fully expose society for the shit it doesn't know.
  • To a chef, the world is one giant laboratory, and their life is one long lab session full of a million experiments.
  • The chef treats his goals and undertakings as experiments whose purpose is as much to learn new information as it is to be ends in themselves.
  • To a chef in the lab, negative feedback is a free boost forward in progress, courtesy of someone else. Pure upside.
  • There's no more reliable corollary than super-successful people thinking failure is fucking awesome.
  • The science approach is all about learning through testing hypotheses, and hypotheses are built to be disproven, which means that scientists learn through failure. Failure is a critical part of their success.
  • Humans are programmed to take fear very seriously, and evolution didn't find it efficient to have us re-asses every fear inside us.
  • As far as society is concerned, when you give something a try--on the values front, the fashion front, the religious front, the career front--you've branded yourself. And since people like to simplify people in order to make sense of things in their own head, the tribe around you reinforces your brand by putting you in a clearly-labeled oversimplified box. What all this amounts to is that it becomes very painful to change.
  • Everyone is trying to do their ass-covering.
  • Doing something out of your comfort zone and having it turn out okay is an incredibly powerful experience, one that changes you--and each time you have that kind of experience, it chips away at your respect for your brain's ingrained irrational fears.
  • So if we want to think like a scientist more often in life, those are the three key objectives--to be humbler about what we know, more confident about what's possible, and less afraid of things that don't matter.
  • In order for an energy source to be sustainable, it has to be both renewable and clean, which I'm not sure everyone realizes are different things--i.e. A) renewable so it won't run out and B) clean so it won't throw garbage into the atmosphere.
  • The sun radiates more energy to the Earth in a couple hours than all of humanity consumes from all sources each year.
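The claim above can be sanity-checked with a back-of-envelope calculation. The constants below are standard reference values (solar constant, Earth's radius, rough annual world energy use), not figures from the notes:

```python
import math

SOLAR_CONSTANT = 1361.0      # W/m^2 of sunlight at Earth's distance from the sun
R_EARTH = 6.371e6            # m, mean Earth radius
ANNUAL_CONSUMPTION_J = 6e20  # J/year, rough order of magnitude for humanity's energy use

# The Earth intercepts sunlight over its cross-sectional disk, not its
# full surface, so the intercepted power is solar_constant * pi * r^2.
intercepted_w = SOLAR_CONSTANT * math.pi * R_EARTH**2   # ~1.7e17 W

# How long does the sun take to deliver a year's worth of human energy use?
hours_to_match = ANNUAL_CONSUMPTION_J / intercepted_w / 3600   # ~1 hour
```

The answer comes out to roughly an hour, consistent with the "couple hours" figure in the note.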
  • Harnessing solar energy just cuts out the middlemen and goes straight to the source.
  • When you realize how little of the world you'd need to cover with solar panels in order to power all of humanity--especially since most of the panels would go on rooftops and not take up extra land--the more obvious a solar future seems.
  • SpaceX is trying to make human life multi-planetary by building a self-sustaining, one-million-person civilization on Mars.
  • It's kind of simple. If we get to a point where there are a million people on Earth who both want to go to Mars and can afford to go to Mars, there will be a million people on Mars.
  • If Mars is affordable and safe and you know you'll be able to come back, a lot of people will want to go.
  • Space travel is currently so expensive mostly because we land rockets by crashing them into the oceans (or incinerating them in the atmosphere).
  • Firing something super heavy and delicate and full of explosive liquid up through the atmosphere without anything going wrong is incredibly hard.
  • The moon is just over one light-second away.
  • Mars is more than three light-minutes away.
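The two light-distance notes above fall straight out of dividing distance by the speed of light. The distances used here are standard reference values (average Earth-Moon distance, closest Earth-Mars approach), not figures from the notes:

```python
C_KM_S = 299_792.458   # speed of light, km/s

def light_time_s(distance_km: float) -> float:
    """One-way light travel time in seconds for a given distance."""
    return distance_km / C_KM_S

# Average Earth-Moon distance: ~384,400 km -> just over one second.
moon_seconds = light_time_s(384_400)

# Closest Earth-Mars approach: ~54.6 million km -> about three minutes.
mars_minutes = light_time_s(54.6e6) / 60
```

This is why the moon is "just over one light-second" away while Mars, even at its closest, is "more than three light-minutes" away--a round-trip radio conversation with Mars has a built-in delay of at least six minutes.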
  • It's [the BFR] more than three times the mass and generates over three times the thrust of the gargantuan Saturn V--the rocket used in the Apollo missions--which currently stands as by far the biggest rocket humanity has made.
  • What I've been calling the Big Fucking Rocket this whole time is actually two things: a Big Fucking Spaceship sitting on top of a Big Fucking Booster.
  • The Raptor engine looks a lot like a Merlin, with one key difference--by significantly increasing the pressure, SpaceX has made the Raptor over three times more powerful than the Merlin.
  • 10,000 flights. That's how many BFS trips to Mars Elon thinks it'll take to bring the Mars population to a million.
  • The people at SpaceX believe that once we're on Mars, the rest of the Solar System becomes accessible as well.
  • Language allows the best epiphanies of the very smartest people, through the generations, to accumulate into a little collective tower of tribal knowledge--a "greatest hits" of their ancestors' best "aha!" moments. Every new generation has this knowledge tower installed in their heads as their starting point in life, leading them to new, even better discoveries that build on what their ancestors learned, as the tribe's knowledge continues to grow bigger and wiser.
  • Language allows each generation to successfully pass a higher percentage of their learnings on to the next generation, so knowledge sticks better through time.
  • Language gives a group of humans a collective intelligence far greater than individual human intelligence and allows each human to benefit from the collective intelligence as if he came up with it all himself.
  • If language let humans send a thought from one brain to another, writing let them stick a thought onto a physical object, like a stone, where it could live forever.
  • Computers can compute, organize, and run complex software--software that can even learn on its own. But they can't think in the way humans can.
  • Knowledge works like a tree. If you try to learn a branch or a leaf of a topic before you have a solid tree trunk foundation of understanding in your head, it won't work. The branches and leaves will have nothing to stick to, so they'll fall right out of your head.
  • The medulla oblongata really just wants you to not die. It does the thankless tasks of controlling involuntary things like your heart rate, breathing, and blood pressure along with making you vomit when it thinks you've been poisoned.
  • The pons deals with swallowing, bladder control, facial expressions, chewing, saliva, tears, and posture--really just whatever it's in the mood for.
  • The midbrain deals with vision, hearing, motor control, alertness, temperature control and a bunch of other things that other people in the brain already do.
  • The cerebellum makes sure you stay a balanced, coordinated, and normal moving person.
  • The limbic system is a survival system. A decent rule of thumb is that whenever you're doing something that your dog might also do--eating, drinking, having sex, fighting, hiding, or running away from something scary--your limbic system is probably behind the wheel.
  • The cortex is in charge of basically everything--processing what you see, hear, and feel, along with language, movement, thinking, planning, and personality.
  • The evolution of our brain happened by building outwards, adding newer, fancier features on top of the existing model.
  • Because the cortex is so thin, it scales by increasing its surface area. That means by creating lots of folds, you can more than triple the area of the brain's surface without increasing the volume too much.
  • Neurons' ability to alter themselves chemically, structurally, and even functionally allows your brain's neural networks to optimize themselves to the external world--a phenomenon called neuroplasticity.
  • The neuroplasticity that makes our brains so useful to us also makes them incredibly difficult to understand--because the way each of our brains works is based on how that brain has shaped itself, based on its particular environment and life experience.
  • Inputting and outputting information is what the brain's neurons do. All the BMI (Brain Machine Interface) industry wants to do is get in on the action.
  • The progress of science, business, and industry are all at the whim of the progress of engineering.
  • The budding industry of brain-machine interfaces is the seed of a revolution that will change just about everything.
  • A whole-brain interface would give your brain the ability to communicate wirelessly with the cloud, with computers, and with the brains of anyone with a similar interface in their head.
  • The thing that people, I think, don't appreciate right now is that they are already a cyborg. You're already a different creature than you would have been twenty years ago, or even ten years ago.
  • You're already digitally superhuman.
  • We communicate with ourselves through thought and with everyone else through symbolic representations of thought, and that's all we can imagine.
  • Emotions are the quintessential example of a concept that words are poorly-equipped to accurately describe.
  • There's evidence from experiments with rats that it's possible to boost how fast a brain can learn--sometimes by 2x or even 3x--just by priming certain neurons to prepare to make a long-term connection.
  • New technology also comes along with real dangers and it always does end up harming a lot of people. But it also always seems to help a lot more people than it harms. Advancing technology almost always proves to be a net positive.
  • With Elon's companies, there's always some "result of the goal" that's his real reason for starting the company--the piece that ties the company's goal into humanity's better future.
  • The whole idea of "of the people, by the people, for the people" is the centerpiece of democracy. Unfortunately, "the people" are unpleasant. So democracy ends up being unpleasant. But unpleasant tends to be a dream compared to the alternatives.
  • AI is definitely going to vastly surpass human abilities.
  • Our minds evolved at a time when progress moved at a snail's pace, so that's what our hardware is calibrated to.