Algorithms
Here it is, from thirty thousand feet, as they say.
Why Study Algorithms
Studying algorithms helps you develop widely applicable
analytical skills—you learn not just how to solve individual
problems but how to develop recipes for getting answers to many problems.
“A person well-trained in computer science knows
how to deal with algorithms: how to construct them, manipulate them,
understand them, analyze them. This knowledge is preparation for much
more than writing good computer programs; it is a general-purpose mental
tool that will be a definite aid to the understanding of other subjects,
whether they be chemistry, linguistics, or music, etc. The reason for
this may be understood in the following way: It has often been said
that a person does not really understand something until after teaching
it to someone else. Actually, a person does not really understand
something until after teaching it to a computer, i.e., expressing
it as an algorithm . . . An attempt to formalize things as algorithms
leads to a much deeper understanding than if we simply try to comprehend
things in the traditional way.” — Donald Knuth
And:
“Algorithmics is more than a branch of computer science.
It is the core of computer science, and, in all fairness, can be said
to be relevant to most of science, business, and technology.”
— David Harel
Classifying Algorithms
Primary dimensions:
- By domain
- By complexity class
- By underlying data structure
- By problem-solving strategy (the most important of the four, covered in detail below)
Secondary concerns include:
- Is the algorithm sequential or parallel?
- Is the algorithm deterministic or randomized?
- Are you looking for one solution or all solutions?
- Are you looking for any solution or the optimal solution?
- Are you looking for an exact or approximate solution?
- Are you computing
- Whether $f(x)$ is true or not (decision),
- The value of $f(x)$ (functional),
- Any $x$ such that $f(x)=y$ (constraint), or
- The $x$ that minimizes (or maximizes) $f(x)$ (optimization)?
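The four flavors above can be sketched concretely. Here is a minimal illustration (the function names and the toy $f(x) = x^2$ are my own, chosen only to make the distinction visible):

```python
# Four flavors of "computing," illustrated with f(x) = x * x over a small
# integer domain. The names decision/functional/constraint/optimization
# are illustrative labels, not standard library functions.

def f(x):
    return x * x

DOMAIN = range(-10, 11)

# Decision: answer a yes/no question about f(x). (Here: is f(x) even?)
def decision(x):
    return f(x) % 2 == 0

# Functional: compute the value of f(x) itself.
def functional(x):
    return f(x)

# Constraint: find any x in the domain with f(x) == y (or None if none exists).
def constraint(y):
    return next((x for x in DOMAIN if f(x) == y), None)

# Optimization: find the x in the domain that minimizes f(x).
def optimization():
    return min(DOMAIN, key=f)
```

So `decision(4)` is `True`, `functional(3)` is `9`, `constraint(49)` yields some $x$ with $x^2 = 49$, and `optimization()` returns `0`.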
Problem Domain
What kind of a problem are you trying to solve?
- Numeric (esp. super large numbers, floating-point numbers with errors, ...)
- Sorting (putting things in order)
- Searching (finding an element by search key)
- String (pattern matching, compression, sequencing, crypto, ...)
- Partitioning (fair division, ...)
- Network (routing, spanning trees, connectivity, flow, ...)
- Combinatoric (permutations, combinations, subsets, ...)
- Geometric (convex hull, closest pair, bounding sphere, ...)
- AI (computer vision, machine learning, ...)
Complexity Class
Let $n$ be the input size and $k$ be a fixed constant.
- Constant ($k$)
- Logarithmic ($\log{n}$)
- Linear ($n$)
- Linearithmic ($n \log{n}$)
- Polynomial ($n^{k} \;\mathrm{where}\:k > 1$)
- Exponential ($k^n$)
- Superexponential ($n!$, $2^{2^n}$, ... ZOMG)
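To get a feel for how far apart these classes really are, here is a quick sketch that tabulates each one (taking $k = 2$ for the polynomial and exponential rows) at a few input sizes:

```python
import math

# Rough growth-rate comparison: the value of each complexity class
# at n = 10, 20, 40, with k = 2 where a constant is needed.
classes = {
    "constant":     lambda n: 1,
    "logarithmic":  lambda n: math.log2(n),
    "linear":       lambda n: n,
    "linearithmic": lambda n: n * math.log2(n),
    "polynomial":   lambda n: n ** 2,
    "exponential":  lambda n: 2 ** n,
}

for name, g in classes.items():
    print(f"{name:12}", [round(g(n)) for n in (10, 20, 40)])
```

Doubling $n$ doubles the linear row, quadruples the quadratic row, and squares the exponential row, which is why the class matters far more than the constant factors.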
Underlying Data Structure
There is a huge
list of data structures at Wikipedia, but here are some of the more common
ones:
- Arrays
- Sorted Arrays
- Linked Lists (singly-linked, doubly-linked, with sentinels, ...)
- Hashtables (with several collision resolution strategies)
- Unbalanced Trees
- Balanced Trees
- Skip Lists
- Heaps (min, max, binary, Fibonacci)
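As one example of how a data structure shapes an algorithm, here is a binary min-heap via Python's standard `heapq` module, the structure underlying many priority-queue algorithms (Dijkstra, Prim, Huffman, heapsort):

```python
import heapq

# Build a binary min-heap by pushing elements one at a time.
heap = []
for priority in [5, 1, 4, 2, 3]:
    heapq.heappush(heap, priority)

# Popping always yields the smallest remaining element, so repeated
# pops produce the elements in sorted order (this is heapsort).
in_order = [heapq.heappop(heap) for _ in range(len(heap))]
print(in_order)  # [1, 2, 3, 4, 5]
```

Each push and pop is logarithmic in the heap size, which is what makes heap-backed algorithms efficient.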
Strategies
This isn’t a complete list, but it covers some of the more famous ones:
- Brute Force. Enumerate all possible solutions, unintelligently, and try them all.
- Divide and Conquer. Break down a problem into multiple independent
subproblems, solve the subproblems (recursively), and combine those solutions into a solution
for the original problem.
- Decrease and Conquer. Like Divide and Conquer, but the problem is reduced to a single subproblem.
- Transform and Conquer. Reformulate problem into an equivalent problem in another domain.
- Input Enhancement. Precalculate results for certain inputs, or create
a cache of partial results, so that the actual run is faster.
- Prestructuring. Pre-arrange data for faster processing.
- Dynamic Programming. Solve an optimization problem by breaking it down into
multiple overlapping subproblems, solving the
subproblems (recursively), and combining those
solutions into a solution for the original problem.
- The Greedy Method. When searching for a solution, take the most promising next
step and never backtrack.
- Backtracking. Systematically generate possible solutions to a problem,
sometimes having to back up when realizing your partially generated candidate
can’t possibly be extended to a real solution.
- Branch and Bound. Backtracking in an optimization setting: prune any branch
whose bound shows it cannot beat the best solution found so far.
- Hill Climbing. Solve (or find an approximate solution to) an optimization
problem by generating candidate solutions that are (hopefully)
improvements over the previous candidate.
- Particle Swarm. Solve an optimization problem with a bunch of
decentralized particles all searching for a solution, with something that looks
like it has a collective organization (e.g., ant colonies, bird flocks, animal
herds, etc.).
- Las Vegas. Generate possible solutions non-deterministically in a way
that always produces the correct answer but makes no guarantees on how long it
will run or how much space it will need (in the worst case).
- Monte Carlo. Generate possible solutions non-deterministically in
a way that has time and space guarantees but a small probability
of giving the wrong answer. The probability of error can be reduced by running
the algorithm longer.
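To contrast two of the strategies above on the same problem, here is a minimal sketch using Fibonacci numbers: the brute-force recursion recomputes the same overlapping subproblems exponentially many times, while the dynamic-programming version (memoized with the standard-library `lru_cache`) solves each subproblem exactly once:

```python
from functools import lru_cache

# Brute force: exponential time, because fib_brute(n - 2) is recomputed
# inside fib_brute(n - 1), and so on all the way down.
def fib_brute(n):
    return n if n < 2 else fib_brute(n - 1) + fib_brute(n - 2)

# Dynamic programming via memoization: each overlapping subproblem is
# solved once and cached, giving linear time.
@lru_cache(maxsize=None)
def fib_dp(n):
    return n if n < 2 else fib_dp(n - 1) + fib_dp(n - 2)

print(fib_brute(10), fib_dp(10))  # both 55
```

The brute-force version is already painful around $n = 40$, while the memoized version handles $n = 500$ instantly, a concrete illustration of why the choice of strategy can matter more than the choice of language or machine.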