- There are three desirable properties for a good algorithm. We seek algorithms that are correct and efficient, while being easy to implement. These goals may not be simultaneously achievable.
- There is a fundamental difference between algorithms, which always produce a correct result, and heuristics, which may usually do a good job, but without providing any guarantee.
- Reasonable-looking algorithms can easily be incorrect. Algorithm correctness is a property that must be carefully demonstrated.
- The heart of any algorithm is an idea. If your idea is not clearly revealed when you express an algorithm, then you are using too low-level a notation to describe it.
- It is impossible to prove the correctness of an algorithm for a fuzzily-stated problem. Put another way, ask the wrong problem and you will get the wrong answer.
- An important and honorable technique in algorithm design is to narrow the set of allowable instances until there is a correct and efficient algorithm.
- The best way to prove that an algorithm is incorrect is to produce an instance in which it yields an incorrect answer. Such instances are called counter-examples.
- Hunting for counter-examples is a skill worth developing. It bears some similarity to the task of developing test sets for computer programs, but relies more on inspiration than exhaustion.
- Searching for counterexamples is the best way to disprove the correctness of a heuristic.
- Mathematical induction is usually the right way to verify the correctness of a recursive or incremental insertion algorithm.
- Modeling is the art of formulating your application in terms of precisely described, well-understood problems. Proper modeling is the key to applying algorithmic design techniques to real-world problems. Indeed, proper modeling can eliminate the need to design or even implement algorithms, by relating your application to what has been done before.
- Modeling is only the first step in designing an algorithm for a problem.
- Modeling your application in terms of well-defined structures and algorithms is the most important single step towards a solution.
- Learning to think recursively is learning to look for big things that are made from smaller things of exactly the same type as the big thing.
- Recursive structures occur everywhere in the algorithmic world.
- Algorithms are the most important and durable part of computer science because they can be studied in a language and machine independent way.
- Algorithms can be understood and studied in a language and machine independent manner.
- The worst-case complexity of the algorithm is the function defined by the maximum number of steps taken in any instance of size n.
- The best-case complexity of the algorithm is the function defined by the minimum number of steps taken in any instance of size n.
- The average-case complexity of the algorithm is the function defined by the average number of steps over all instances of size n.
- The Big-Oh notation and worst-case analysis are tools that greatly simplify our ability to compare the efficiency of algorithms.
- Although esoteric functions arise in advanced algorithm analysis, a small variety of time complexities suffice and account for most algorithms that are widely used in practice.
- A basic rule of thumb in Big-Oh analysis is that worst-case running time follows from multiplying the largest number of times each nested loop can iterate.
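As a quick illustration of that rule of thumb (my sketch, not the book's): the doubly nested loop below does at most len(a) * len(b) comparisons, so its worst-case running time is O(nm).

```python
def count_pairs_in_common(a, b):
    """Count value pairs appearing in both lists with a doubly nested loop.

    The outer loop runs len(a) times and the inner loop runs len(b) times
    per outer iteration, so the worst case is O(len(a) * len(b)) steps.
    """
    count = 0
    for x in a:          # runs len(a) times
        for y in b:      # runs len(b) times for each x
            if x == y:
                count += 1
    return count

print(count_pairs_in_common([1, 2, 3], [2, 3, 4]))   # 2
```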
- Pattern matching is the most fundamental algorithmic operation on text strings.
- A logarithm is simply an inverse exponential function.
- Exponential functions grow at a distressingly fast rate. Thus, inverse exponential functions--i.e., logarithms--grow refreshingly slowly.
- Logarithms arise in any process where things are repeatedly halved.
- Binary search is one of the most powerful ideas in algorithm design.
- Logarithms arise whenever things are repeatedly halved or doubled.
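A minimal binary search sketch (mine, not the book's code): each probe halves the live range, so a sorted array of n keys needs only about log2(n) comparisons.

```python
def binary_search(keys, target):
    """Return the index of target in the sorted list keys, or -1 if absent."""
    lo, hi = 0, len(keys) - 1
    while lo <= hi:
        mid = (lo + hi) // 2          # probe the middle of the live range
        if keys[mid] == target:
            return mid
        elif keys[mid] < target:
            lo = mid + 1              # discard the lower half
        else:
            hi = mid - 1              # discard the upper half
    return -1

print(binary_search([2, 3, 5, 7, 11, 13], 7))   # 3
print(binary_search([2, 3, 5, 7, 11, 13], 6))   # -1
```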
- The maximum benefit from good data structures results from designing your program around them in the first place.
- Data structures can be neatly classified as either contiguous or linked, depending upon whether they are based on arrays or pointers.
- Contiguously-allocated structures are composed of single slabs of memory, and include arrays, matrices, heaps, and hash tables.
- Linked data structures are composed of distinct chunks of memory bound together by pointers, and include lists, trees, and graph adjacency lists.
- The array is the fundamental contiguously-allocated data structure. Arrays are structures of fixed-sized data records such that each element can be efficiently located by its index or address.
- Physical continuity between successive data accesses helps exploit the high-speed cache memory on modern computer architectures.
- Pointers are the connections that hold the pieces of linked structures together.
- Pointers represent the address of a location in memory.
- A variable storing a pointer to a given data item can provide more freedom than storing a copy of the item itself.
- Dynamic memory allocation provides us with flexibility on how and where we use our limited storage resources.
- Stacks are simple to implement and very efficient. For this reason, stacks are probably the right container to use when retrieval order doesn’t matter at all, such as when processing batch jobs.
- Algorithmically, LIFO tends to happen in the course of executing recursive algorithms.
- Queues are somewhat trickier to implement than stacks and thus are more appropriate for applications where the order is important.
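A small sketch of the LIFO/FIFO contrast, using Python's built-in list and collections.deque as the containers (my choice of containers, not prescribed by the book):

```python
from collections import deque

items = ["a", "b", "c"]

# Stack (LIFO): the last item pushed is the first retrieved.
stack = []
for x in items:
    stack.append(x)
print([stack.pop() for _ in range(len(items))])      # ['c', 'b', 'a']

# Queue (FIFO): the first item enqueued is the first retrieved.
queue = deque()
for x in items:
    queue.append(x)
print([queue.popleft() for _ in range(len(items))])  # ['a', 'b', 'c']
```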
- The dictionary data type permits access to data items by content. You stick an item into a dictionary so you can find it when you need it.
- Data structure design must balance all the different operations it supports. The fastest data structure to support both operations A and B may well not be the fastest structure to support either operation A or B.
- Picking the wrong data structure for the job can be disastrous in terms of performance. Identifying the very best data structure is usually not as critical, because there can be several choices that perform similarly.
- Building algorithms around data structures such as dictionaries and priority queues leads to both clean structure and good performance.
- Greedy heuristics always try to grab the best possible thing first.
- When working with a large enough data set, only linear or near linear algorithms are likely to be fast enough.
- Choosing the right data structure is often the key to getting the time complexity down to this point.
- Using smart heuristics like greedy is likely to significantly improve quality over the naive approach.
- Hash tables are a very practical way to maintain a dictionary. They exploit the fact that looking an item up in an array takes constant time once you have its index.
- A hash function is a mathematical function that maps keys to integers. We will use the value of our hash function as an index into an array, and store our item at that position.
- Chaining is the easiest approach to collision resolution.
- Pragmatically, a hash table is often the best data structure to maintain a dictionary.
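Here is a toy chained hash table to make the last few points concrete. It is a sketch, not production code: the bucket count is fixed, and Python's built-in hash() stands in for a real hash function.

```python
class ChainedHashTable:
    """A minimal dictionary using hashing with chaining for collisions."""

    def __init__(self, num_buckets=101):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # hash() maps the key to an integer; modulo turns it into an index.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                    # overwrite an existing key
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def search(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

table = ChainedHashTable()
table.insert("apple", 3)
table.insert("pear", 5)
print(table.search("apple"), table.search("plum"))   # 3 None
```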
- Strings are sequences of characters where the order of the characters matters.
- Text strings are fundamental to a host of computing applications, from programming language parsing/compilation, to web search engines, to biological sequence analysis.
- The primary data structure for representing strings is an array of characters. This allows us constant-time access to the i-th character of the string.
- The most fundamental operation on text strings is substring search.
- The key idea of hashing is to represent a large object using a single number. The goal is a representation of the large object by an entity that can be manipulated in constant time, such that it is relatively unlikely that two different large objects map to the same value.
- Hashing has a variety of clever applications beyond just speeding up search.
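One such application is substring search via a rolling hash (the Rabin-Karp idea). The sketch below is illustrative rather than definitive: the base and modulus are arbitrary choices of mine, and a hash match is always confirmed by direct comparison because different strings can collide.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_003):
    """Return the first index where pattern occurs in text, or -1.

    Each window of the text is summarized by a single number (its hash),
    updated in constant time as the window slides one character.
    """
    n, m = len(text), len(pattern)
    if m > n:
        return -1
    high = pow(base, m - 1, mod)             # weight of the leading character
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    for i in range(n - m + 1):
        if w_hash == p_hash and text[i:i + m] == pattern:
            return i                         # confirmed match, not just a collision
        if i < n - m:                        # slide the window one character
            w_hash = ((w_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return -1

print(rabin_karp("the quick brown fox", "brown"))   # 10
```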
- Suffix trees are amazing data structures.
- Sorting is the basic building block that many other algorithms are built around. By understanding sorting, we obtain an amazing amount of power to solve other problems.
- An important algorithm design technique is to use sorting as a basic building block, because many other problems become easy once a set of items is sorted.
- Never be afraid to spend time sorting, provided you use an efficient sorting routine.
- Sorting lies at the heart of many algorithms. Sorting the data is one of the first things any algorithm designer should try in the quest for efficiency.
- Heaps are a simple and elegant data structure for efficiently supporting the priority queue operations insert and extract-min.
- Although other algorithms prove slightly faster in practice, you won’t go wrong using heapsort for sorting data that sits in the computer’s main memory.
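A compact heapsort sketch, leaning on Python's heapq module as the binary heap (an implementation convenience of mine): build a min-heap, then repeatedly extract the minimum.

```python
import heapq

def heapsort(items):
    """Sort by heapifying everything, then repeatedly extracting the minimum --
    selection sort armed with the right data structure."""
    heap = list(items)
    heapq.heapify(heap)                 # O(n) bottom-up heap construction
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heapsort([5, 3, 8, 1, 9, 2]))     # [1, 2, 3, 5, 8, 9]
```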
- Recursive algorithms reduce large problems into smaller ones.
- Randomization is a powerful tool to improve algorithms with bad worst-case but good average-case complexity. It can be used to make algorithms more robust to boundary cases and more efficient on highly structured input instances that confound heuristic decisions.
- The fundamental step in quicksort is partitioning elements around a pivot.
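A sketch of that partition step inside a randomized quicksort (my code, using the common Lomuto-style partition rather than any particular textbook listing):

```python
import random

def quicksort(a, lo=0, hi=None):
    """In-place quicksort; a random pivot guards against bad worst cases."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot_index = random.randint(lo, hi)
    a[pivot_index], a[hi] = a[hi], a[pivot_index]    # stash pivot at the end
    pivot = a[hi]
    firsthigh = lo                     # boundary between low and high sides
    for i in range(lo, hi):
        if a[i] < pivot:
            a[i], a[firsthigh] = a[firsthigh], a[i]
            firsthigh += 1
    a[firsthigh], a[hi] = a[hi], a[firsthigh]        # pivot lands in place
    quicksort(a, lo, firsthigh - 1)
    quicksort(a, firsthigh + 1, hi)

data = [5, 3, 8, 1, 9, 2]
quicksort(data)
print(data)          # [1, 2, 3, 5, 8, 9]
```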
- Sorting can be used to illustrate most algorithm design paradigms. Data structure techniques, divide-and-conquer, randomization, and incremental construction all lead to efficient sorting algorithms.
- Binary search is a fast algorithm for searching in a sorted array of keys.
- Binary search and its variants are the quintessential divide-and-conquer algorithms.
- One of the most powerful techniques for solving problems is to break them down into smaller, more easily solved pieces. Smaller problems are less overwhelming, and they permit us to focus on details that are lost when we are studying the entire problem.
- Effective parallel processing requires decomposing jobs into at least as many tasks as processors, and is becoming more important with the advent of cluster computing and multicore processors.
- The key to solving many algorithmic problems is to think of them in terms of graphs. Graph theory provides a language for talking about the properties of relationships, and it is amazing how often messy applied problems have a simple description and solution in terms of classical graph properties.
- Designing truly novel graph algorithms is a very difficult task. The key to using graph algorithms effectively in applications lies in correctly modeling your problem so you can take advantage of existing algorithms.
- Graphs can be used to model a wide variety of structures and relationships. Graph-theoretic terminology gives us a language to talk about them.
- Adjacency lists are the right data structure for most applications of graphs.
- The take-home lesson is that even elementary problems like initializing data structures can prove to be bottlenecks in algorithm development. Most programs working with large amounts of data have to run in linear or almost linear time.
- Breadth-first and depth-first searches provide mechanisms to visit each edge and vertex of the graph. They prove the basis of most simple, efficient graph algorithms.
- By storing the vertices in a first-in, first-out (FIFO) queue, we explore the oldest unexplored vertices first.
- By storing the vertices in a last-in, first-out (LIFO) stack, we explore the vertices by lurching along a path, visiting a new neighbor if one is available, backtracking up only when we are surrounded by previously discovered vertices.
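A minimal BFS sketch on an adjacency-list graph (mine, not the book's C code). The FIFO queue is exactly what makes it breadth-first; swapping popleft() for pop() turns it into an iterative depth-first search.

```python
from collections import deque

def bfs(graph, start):
    """Visit vertices reachable from start in breadth-first order.

    graph is an adjacency list: a dict mapping each vertex to its neighbors.
    """
    order = []
    discovered = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()              # oldest unexplored vertex first
        order.append(v)
        for w in graph[v]:
            if w not in discovered:
                discovered.add(w)
                queue.append(w)
    return order

graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(bfs(graph, "a"))                   # ['a', 'b', 'c', 'd']
```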
- DFS organizes vertices by entry/exit times, and edges into tree and back edges. This organization is what gives DFS its real power.
- As algorithm design paradigms go, a depth-first search isn’t particularly intimidating. It is surprisingly subtle, however, meaning that its correctness requires getting details right.
- Greedy algorithms make the decision of what to do next by selecting the best local option from all available choices without regard to the global structure.
- Most applications of graphs can be reduced to standard graph properties where well-known algorithms can be used. These include minimum spanning trees, shortest paths, and other problems presented in the catalog.
- The maximum flow from s to t always equals the weight of the minimum s-t cut. Thus, flow algorithms can be used to solve general edge and vertex connectivity problems in graphs.
- Proper modeling is the key to making effective use of graph algorithms.
- Designing novel graph algorithms is very hard, so don’t do it. Instead, try to design graphs that enable you to use classical algorithms to model your problem.
- Backtracking is a systematic way to iterate through all the possible configurations of a search space.
- Combinatorial searches, when augmented with tree pruning techniques, can be used to find the optimal solution of small optimization problems. How small depends upon the specific problem, but typical size limits are somewhere between 15 <= N <= 50 items.
- Successful pruning requires looking ahead to see when a solution is doomed to go nowhere, and backing off as soon as possible.
- Even simple pruning strategies can suffice to reduce running time from impossible to instantaneous.
- Clever pruning can make short work of surprisingly hard combinatorial search problems. Proper pruning will have a greater impact on search time than any other factor.
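A small backtracking-with-pruning sketch on subset sum (my example problem, chosen for brevity): a partial solution that overshoots the target, or cannot reach it with what remains, is abandoned immediately.

```python
def subset_sum(values, target):
    """Return a subset of values summing to target, or None if none exists."""
    values = sorted(values, reverse=True)
    # remaining[i] = sum of everything from position i onward, for pruning.
    remaining = [sum(values[i:]) for i in range(len(values) + 1)]

    def extend(i, total, chosen):
        if total == target:
            return chosen
        if i == len(values):
            return None
        if total > target or total + remaining[i] < target:
            return None                        # prune: doomed to go nowhere
        with_i = extend(i + 1, total + values[i], chosen + [values[i]])
        if with_i is not None:
            return with_i
        return extend(i + 1, total, chosen)    # backtrack: skip values[i]

    return extend(0, 0, [])

print(subset_sum([8, 6, 7, 5, 3], 16))         # [8, 5, 3]
```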
- The simplest method to search in a solution space uses random sampling. It is also called the Monte Carlo method. We repeatedly construct random solutions and evaluate them, stopping as soon as we get a good enough solution, or (more likely) when we are tired of waiting. We report the best solution found over the course of our sampling.
- Simulated annealing is effective because it spends much more of its time working on good elements of the solution space than on bad ones, and because it avoids getting trapped repeatedly in the same local optima.
- Simulated annealing is a simple but effective technique for efficiently obtaining good but not optimal solutions to combinatorial search problems.
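A generic simulated annealing loop, sketched under the usual assumptions (geometric cooling, exponential acceptance rule, untuned parameters). The toy neighborhood and cost function at the bottom are mine and exist only to make the skeleton runnable.

```python
import math
import random

def simulated_annealing(initial, cost, random_neighbor,
                        temperature=1.0, cooling=0.999, steps=20_000):
    """Downhill moves are always taken; uphill moves are taken with a
    probability that shrinks as the temperature cools, which is what lets
    the search escape local optima early on."""
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    for _ in range(steps):
        candidate = random_neighbor(current)
        delta = cost(candidate) - current_cost
        if delta < 0 or random.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost

# Toy use: nudge a shuffled permutation toward sorted order by counting
# out-of-order adjacent pairs and swapping two random positions per move.
def swap_two(p):
    p = list(p)
    i, j = random.sample(range(len(p)), 2)
    p[i], p[j] = p[j], p[i]
    return p

start = random.sample(range(10), 10)
out_of_order = lambda p: sum(p[i] > p[i + 1] for i in range(len(p) - 1))
print(simulated_annealing(start, out_of_order, swap_two))
```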
- I have never encountered any problem where genetic algorithms seemed to me the right way to attack it. Further, I have never seen any computational results reported using genetic algorithms that have favorably impressed me. Stick to simulated annealing for your heuristic search voodoo needs.
- Your time spent parallelizing code might be better spent enhancing the sequential version.
- I recommend considering parallel processing only after attempts at solving a problem sequentially prove too slow. Even then, I would restrict attention to algorithms that parallelize the problem by partitioning the input into distinct tasks where no communication is needed between the processors, except to collect the final results.
- Once you understand it, dynamic programming is probably the easiest algorithm design technique to apply in practice.
- Dynamic programming is a technique for efficiently implementing a recursive algorithm by storing partial results. The trick is seeing whether the naive recursive algorithm computes the same subproblems over and over and over again. If so, storing the answer for each subproblem in a table to look up instead of recompute can lead to an efficient algorithm.
- Start with a recursive algorithm or definition. Only once we have a correct recursive algorithm do we worry about speeding it up by using a results matrix.
- Dynamic programming is generally the right method for optimization problems on combinatorial objects that have an inherent left to right order among components.
- Dynamic programming is essentially a tradeoff of space for time. Repeatedly recomputing a given quantity is harmless unless the time spent doing so becomes a drag on performance. Then we are better off storing the results of the initial computation and looking them up instead of recomputing them again.
- Explicit caching of the results of recursive calls provides most of the benefits of dynamic programming, including usually the same running time as the more elegant full solution. If you prefer doing extra programming to more subtle thinking, you can stop here.
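That caching idea in miniature (my sketch): memoizing the classic edit-distance recurrence turns an exponential recursion into an O(mn) algorithm.

```python
from functools import lru_cache

def edit_distance(a, b):
    """Minimum insertions, deletions, and substitutions to turn a into b.

    The plain recursion recomputes the same (i, j) subproblems over and
    over; caching each result once bounds the work by len(a) * len(b).
    """
    @lru_cache(maxsize=None)
    def dist(i, j):
        if i == 0:
            return j                             # insert the rest of b
        if j == 0:
            return i                             # delete the rest of a
        match = 0 if a[i - 1] == b[j - 1] else 1
        return min(dist(i - 1, j - 1) + match,   # match or substitute
                   dist(i - 1, j) + 1,           # delete from a
                   dist(i, j - 1) + 1)           # insert into a
    return dist(len(a), len(b))

print(edit_distance("kitten", "sitting"))        # 3
```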
- Once you understand dynamic programming, it can be easier to work out such algorithms from scratch than to try to look them up.
- For any optimization problem on left-to-right objects, such as characters in a string, elements of a permutation, points around a polygon, or leaves in a search tree, dynamic programming likely leads to an efficient algorithm to find the optimal solution.
- Dynamic programming algorithms are only as correct as the recurrence relations they are based on.
- The running time of any dynamic programming algorithm is a function of two things: (1) number of partial solutions we must keep track of, and (2) how long it will take to evaluate each partial solution.
- Without an inherent left-to-right ordering on the objects, dynamic programming is usually doomed to require exponential space and time.
- The global optimum is often noticeably better than the solution found by typical heuristics. How important this improvement is depends on your application, but it can never hurt.
- Reductions are a way to show that two problems are essentially identical. A fast algorithm for one of the problems implies a fast algorithm for the other.
- A small set of NP-complete problems suffice to prove the hardness of most other hard problems.
- Approximation algorithms guarantee answers that are always close to the optimal solution. They can provide a practical approach to dealing with NP-complete problems.
- The key to algorithm design (or any other problem-solving task) is to proceed by asking yourself questions to guide your thought process.
- Should you get stuck on the problem, the best thing to do is move on to the next question.
- By clearly articulating your reasoning as to why something doesn’t work, you can check whether you have glossed over a possibility that you didn’t want to think hard enough about. It is amazing how often the reason you can’t find a convincing explanation for something is because your conclusion is wrong.
- The distinction between strategy and tactics is important to keep aware of during any design process. Strategy represents the quest for the big picture--the framework around which we construct our path to the goal. Tactics are used to win the minor battles we must fight along the way.
- In problem-solving, it is important to check repeatedly whether you are thinking on the right level.
- Problem-solving is not a science, but part art and part skill. It is one of the skills most worth developing.
- Data structures are not so much algorithms as they are the fundamental constructs around which you build your application. Becoming fluent in what the standard data structures can do for you is essential to get full value from them.
- The abstract data type “dictionary” is one of the most important structures in computer science.
- In practice, it is more important to avoid using a bad data structure than to identify the single best option available.
- For applications involving a moderate-to-large number of keys (say between 100 and 10,000,000), a hash table is probably the right way to go.
- Balanced search trees use local rotation operations to restructure search trees, moving more distant nodes closer to the root while maintaining the in-order search structure of the tree.
- Which tree is best for your application? Probably the one of which you have the best implementation. The flavor of balanced tree is probably not as important as the skill of the programmer who coded it.
- Once a data structure has to be stored outside of main memory, the search time grows by several orders of magnitude.
- The idea behind a B-tree is to collapse several levels of a binary search tree into a single large node, so that we can make the equivalent of several search steps before another disk access is needed.
- Priority queues are useful data structures in simulations, particularly for maintaining a set of future events ordered by time. They are called “priority” queues because they enable you to retrieve items not by the insertion time (as in a stack or queue), nor by a key match (as in a dictionary), but by which item has the highest priority of retrieval.
- Suffix trees and arrays are phenomenally useful data structures for solving string problems elegantly and efficiently. Proper use of suffix trees often speeds up string processing algorithms [...] to linear time--likely the answer.
- Tries are very simple to build (repeatedly insert new strings) and very fast to search, although they can be expensive in terms of memory.
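A bare-bones trie sketch (mine), using a dictionary-of-children node layout, which is one of several reasonable representations:

```python
class TrieNode:
    """A trie stores strings character by character along root-to-node paths."""
    def __init__(self):
        self.children = {}        # maps a character to the next TrieNode
        self.is_word = False      # True if a stored string ends here

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def search(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return node.is_word

trie = Trie()
for w in ["tree", "trie", "try"]:
    trie.insert(w)
print(trie.search("trie"), trie.search("tr"))   # True False
```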
- Special efforts must be taken to represent very large graphs efficiently.
- The bottom line is that you should try to avoid working in high-dimensional spaces, perhaps by discarding (or projecting away) the least important dimensions.
- Numerical algorithms typically perform repeated floating-point computations, which accumulate error at each operation until, eventually, the results are meaningless.
- A simple and dependable way to test for round-off errors in numerical programs is to run them both at single and double precision, and then think hard whenever there is a disagreement.
- Linear programming is the most important problem in mathematical optimization and operations research.
- The simplex method is the standard algorithm for linear programming. Each constraint in a linear programming problem acts like a knife that carves away a region from the space of possible solutions. We seek the point within the remaining region that maximizes (or minimizes) f(X).
- The bottom line on linear programming is this: you are much better off using an existing LP code than writing your own.
- Linear programming is an algorithmic problem of such economic importance that commercial implementations are far superior to free versions.
- It is fundamentally impossible to produce truly random numbers on any deterministic device.
- The best we can hope for are pseudorandom numbers, a stream of numbers that appear as if they were generated randomly.
- The standard algorithm of choice [for generating random numbers] is the linear congruential generator. It is fast, simple, and (if instantiated with the right constants) gives reasonable pseudorandom numbers.
- Note that the stream of numbers produced by a linear congruential generator repeats the instant the first number repeats.
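A linear congruential generator in a few lines (my sketch); the constants below follow a classic C rand() reference implementation and are used here only as an example choice.

```python
def linear_congruential(seed, a=1103515245, c=12345, m=2**31):
    """Yield pseudorandom integers in [0, m) from the recurrence
    x_{n+1} = (a * x_n + c) mod m.

    The stream repeats as soon as any value repeats, so the period is at
    most m; poor choices of a, c, and m make it far shorter.
    """
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = linear_congruential(seed=42)
print([next(gen) for _ in range(5)])
```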
- High- but not arbitrary-precision arithmetic can be conveniently performed using the Chinese remainder theorem and modular arithmetic.
- Taking the Fourier transform of a function is equivalent to representing it as the sum of sine functions.
- By eliminating undesirable high- and/or low-frequency components and taking an inverse Fourier transform to get us back into the time domain, we can filter an image to remove noise and other artifacts.
- A convolution is the pairwise product of elements from two different sequences, such as in multiplying two n-variable polynomials f and g or comparing two character strings.
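A naive convolution sketch via polynomial multiplication (my example): coefficient k of the product sums every way to pair f[i] with g[j] where i + j = k. This direct method is O(n^2); the fast Fourier transform brings it down to O(n log n).

```python
def convolve(f, g):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    result = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            result[i + j] += fi * gj     # f[i]*g[j] contributes to degree i+j
    return result

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
print(convolve([1, 2], [3, 4]))          # [3, 10, 8]
```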
- Indeed, “when in doubt, sort” is one of the first rules of algorithm design.
- Degeneracy refers to annoying special cases that must be treated in substantially different ways, such as when two lines intersect in more or less than a single point.
- There are three primary approaches to dealing with degeneracy:
- Ignore it--Make it an operating assumption that your program will work correctly only if no three points are collinear, no three lines meet at a point, no intersections happen at the endpoints of line segments, etc.
- Fake it-- Randomly or symbolically perturb your data so that it becomes nondegenerate.
- Deal with it--Geometric applications can be made more robust by writing special code to handle each of the special cases that arise.
- Geometric computations often involve floating-point arithmetic, which leads to problems with overflows and numerical precision. There are three basic approaches to the issue of numerical stability:
- Integer arithmetic--By forcing all points of interest to lie on a fixed-size integer grid, you can perform exact comparisons to test whether any two points are equal or two line segments intersect.
- Double precision reals--By using double-precision floating point numbers, you may get lucky and avoid numerical errors.
- Arbitrary precision arithmetic--This is certain to be correct, but also to be slow.
- Sets are collections of symbols whose order is assumed to carry no significance, while strings are defined by the sequence or arrangement of symbols.
- The assumption of a fixed order makes it possible to solve string problems much more efficiently than set problems, through techniques such as dynamic programming and advanced data structures like suffix trees.
- A good algorithm designer does not reinvent the wheel, and a good programmer does not rewrite code that other people have written. Picasso put it best: “Good artists borrow. Great artists steal.”
2019-05-26
The Algorithm Design Manual by Steven S. Skiena