The best way to prepare for your final is to:
The final will be made available on BrightSpace around noon on the Monday of finals week and must be submitted by 11:59pm on the Friday of finals week. You will have 120 minutes to do the exam. Choose any two-hour period in the window to take the exam. You must complete the exam in one sitting. All problems will be multiple choice, multiple select, or matching. Submit on time; the penalty for late submissions is 5 points per minute.
Expect 20-30 questions. There will be no nasty all-of-the-above or none-of-the-above options. There will, however, occasionally be questions for which one answer is solid and correct, while the others make true, or nearly-always-true, statements that do not fully answer the question asked.
The intent of the questions is not to trick you. The intent is to ensure that students who have gained a fairly deep understanding of computer science fundamentals (linguistic and theoretical concepts), and who can answer without doing a one-hour web search, are rewarded with a higher grade. All students can be this kind of student. If you have read the materials (and there was indeed a ton of reading!) and have participated in the writing of your term project (a real compiler!), you are that kind of student.
As always, you MAY use books, notes, and web searches to look things up. You will not be spied on: there is no browser lockdown and hence no need to hide a mobile device in a bag of potato chips. However, you MAY NOT solicit answers in any way. There is to be no asking for help, no posting on forums, no communication with other humans or chat bots in any way; you can only “look things up,” you may never “ask.” You also MAY NOT post answers or help any other test taker. You are bound by an honor code to follow these rules.
Review the learning objectives from the syllabus now. If there is any unmet objective, let the instructor know. For reference, the objectives are repeated here:
There are several things that all recipients of a computer science degree are expected to know. Ideally, these would be verified by an oral exam with a pass/fail outcome. Don’t worry: such an oral exam is not feasible, and an examinee might fail simply because of nerves, so you’re going to test yourself in the comfort of your own space. Make sure you know the following:
Do you know how to answer each and every question? Can you articulate good answers to each? If so, great! Congratulations! These are the minimal criteria for passing.
You learned some theory and some bits of knowledge. But hopefully, you got so much more. You should have:
Why did you write a compiler?
Back to the academic side of things.
Review the course notes covered during lectures.
Here is a rough outline of the course material.
THEORIES OF COMPUTER SCIENCE
  What is a theory?
  Why do we have theories?
  Historical path to computer science as a discipline
  The four major theories of computation and their concerns
    Language Theory
      Concerned with how computations are expressed
    Automata Theory
      Concerned with how computations are performed
      Sneak peek: Turing Machines
      The stunning notion of computational universality
    Computation Theory
      Concerned with what can and cannot be computed
      Sneak peek: the halting problem is undecidable
    Complexity Theory
      Concerned with how efficiently computations can be performed
      Sneak peek: P vs. NP

LANGUAGE THEORY
  Concerned with how computations are expressed
  Why study language theory?
  Information representation
  Formal Language Theory
    Symbols, Alphabets, Strings, Languages
    Operators on languages: Union, Intersection, Concatenation, Kleene Star
    How to formally define a language?
    Grammars
      Role of variables
      Grammar notation
      How strings are generated
      Lots of example grammars
      Parse Trees (aka Derivation Trees)
      Ambiguity
      Formal Definition
      Restrictions
        CFG: LHS is only one variable
        RLG: LHS is only one variable and RHS is symbols + at most one variable
        Type-1: RHS never shrinks (only one exception if empty string is in the language)
    Language Recognition
      Automata can be used for this
      Analytic Grammars
    Language Classification
      Chomsky Hierarchy for Formal Languages (original version)
        Regular (Type 3)
        Context-Free (Type 2)
        Type-1 (aka “Context sensitive”)
        Unrestricted
      Larger Chomsky Hierarchy
        Finite
        Regular
        Context-Free
        “Context-Sensitive”
        Recursive
        Recursively Enumerable (r.e.)
        Finitely Describable
  Programming Language Theory
    How PLT differs from formal language theory
    Concerns (Just a list for now)
      Syntax
      Semantics
      Type Theory
      Static Analysis
      Translation
      Runtime Systems
      Verification
      Metaprogramming
      Classification

SYNTAX
  Motivation (there is a structure underlying all programs)
  Many ways to express this structure as a string
  Definition of Syntax
  Syntax Diagrams
  Lexical vs. Phrase Syntax
    Why this is massively important
    Ways to represent the difference
    Tokens
    Parse Trees
    The frontier of the parse tree is the token stream
  Dealing with Ambiguity
    Precedence (and how to capture it in a grammar)
    Associativity (and how to capture it in a grammar)
  Parsing (sneak peek only)
    Hand-crafted, recursive descent
    Parser generators
    Analytic Grammars
      PEGs
      Ohm
  The Problem of Context
    Things you cannot capture in a context-free grammar, incomplete list:
      No redeclare within scope
      No use of possibly uninitialized variables
      Type checking
      Correct number of arguments must appear in a call
      Access modifiers must be correct
      All execution paths through a function must end in a return
      All abstract methods must be implemented or declared abstract
      All declared local variables must be used
      All private methods in a class must be used
    Is this stuff syntax or semantics?
      People can disagree
      Side note: can be formalized in theory but why bother
    Type inference
  Abstract Syntax
    What ASTs look like
    Difference between CSTs (Parse trees) and ASTs
    Tree grammars to formally define ASTs
    Esprima
    Examples in JavaScript
    Examples in Java
  Aside: Different syntax formalisms in the real world

LANGUAGE DESIGN
  Things to know
    Major features of existing programming languages
    Historical Issues
      What Bret Victor says about the 1960s and 1970s
      What Alan Kay thinks
  The process of language design
    Big picture and big questions
    Starter set of features
    Design your abstract syntax
    Sketch and Prototype with Ohm!!!
    Start working on lower-level syntax
    What kind of sugar do you want?
  Differences between syntax, semantics, pragmatics
  Ohm for language design
    Ohm grammar notation
    Ohm details
    Examples of Ohm grammars
  Case study: Astro
  Case study: Bella
  Case study: Carlos

COMPILERS
  Translators vs. interpreters
  Compilers, assemblers, transpilers
  AOT vs. JIT
  Overall structure of translation
    Analysis -> Generation
    Analysis -> Optimization -> Generation
    Parsing -> Static Analysis -> Optimization -> Code Generation
  Lexical Analysis
    Characters to tokens
  Syntax Analysis = Parsing
    Tokens to CSTs
  Semantic Analysis = Static Analysis
    CSTs to ASTs
    But ASTs are not really trees
    Type checking and other semantic analysis
  Intermediate Representations
    Why have them?
  Sneak peek: later phases of the compiler
    Control Flow Analysis
    Data Flow Analysis
    Optimization of decorated AST
    Production of high-level language code
    Production of abstract intermediate structures
    Production of bytecode
    Production of abstract assembly language
    Machine independent optimization
  Modern compilers are not just one-shot translators
  How to architect a compiler using Ohm
    parser.js
    analyzer.js
      Representing context
      Checks, especially type checking
    optimizer.js
    generator.js
    core.js
    compiler.js
    <your-language-name>.js
    Tests for compiler, parser, analyzer, optimizer, generator
  Why you should write a compiler

AUTOMATA THEORY
  Concerned with how computations can be carried out
  Broad classification of automata
    Transducers vs Recognizers/Deciders
    Tapes vs Registers
    State Machines vs Instruction Lists
    Harvard vs von Neumann Architecture
  Turing Machines
    How they work
    Many Examples
    Variations that neither restrict nor expand computing power
      Multi-track
      Multi-head
      Multi-tape
      Queue
    Variations that restrict computing power
      LBAs: Bounded tape
      PDAs: Input is read-only, read left-to-right once, memory is a stack
      FAs: Input is read-only, read left-to-right once, no memory
  Register Machines
    Counter machines
    RAMs
  Other “Automata-like” Formalisms
    String rewriting systems
    λ-Calculus
    Brainf**k
    Recursive Functions (not covered in class)
  Applications to Intermediate Representations
    Why have them?
      Analysis/Synthesis is inherent to translation
      Break down complex problem
      Retargetability
      For machine independent optimizations
    High-level vs. Medium-level vs. Low-level
    Styles
      Abstract assembly language (instructions called tuples)
      Stack code
    List of well-known IRs
      JVM
      CLR
      LLVM
      SIL
      CIL
      Tuples
  Applications to Virtual Machines and Real Machines
    Machine Architecture
      How machines work (review)
      Intel 64 architecture
    Review of x86-64 Assembly Language
      Registers and instructions
      Calling conventions
      Parallel instructions
  Code Generation
    Goals
    Translation to JavaScript
    Translation to Assembly Language
    Naïve
    Interpretive
    Code generator generators
    Generation of real assembly language
      Address assignment
      Instruction selection
      Register allocation
      Low-level optimization
    Understanding the runtime system for block-structured languages
      Stack frames
      Dynamic links
      Static links
      Register save area
      Register spilling

PARSING THEORY
  What is parsing?
  Lexical vs Syntactic parsing
  Regular expressions
    In theory (type-3)
    In practice
      Common notation for Regexes in modern languages
        ( ) [ ] { } ^ $ . \ ? * + |
      Uses: validation, search, extraction, replace
      Groups
      Quantifiers
        Eager: * + ? {}
        Reluctant: *? +? ?? {}?
        Possessive: *+ ++ ?+ {}+
      Backreferences \1 \2 ...
      Anchors: ^ $ \A \Z \b \B
      Lookarounds: ?= ?! ?<= ?<!
      Performance concerns
  Approaches to parsing
    Top-down, LL, Expand-Match
    Bottom-up, LR, Shift-Reduce
  Recursive Descent
  PEGs
  Parsing in the Real world

COMPUTABILITY THEORY
  Concerned with what can and cannot be computed
  History: Hilbert, Gödel, Church, Turing
    Bernhardt book
    Wadler video
  So many equivalent models, all Turing-complete, hence the Church-Turing Thesis
  Of course there are uncomputable functions: Diagonalization
  Halting Problem is undecidable
  Limits
    Non-computable functions = Non-recognizable languages
    Non-decidable problems = Non-decidable languages
  Reductions
  Rice’s Theorem
  Chomsky Hierarchy: The Full Version
    Finite = S->a|b|c = Non-looping FAs
    Regular = Right Linear Grammar = Finite Automata
    Deterministic Context Free = LR = DPDA
    Context Free = CFG = (N)PDA
    Type-1 = Linear Bounded Automata
    Decidable (Recursive) = Turing Machines that always Halt
    Recognizable = r.e. = Turing Machines
    Finitely Describable (no machine out here)
    Beyond Finitely Describable 🤯

COMPLEXITY THEORY
  Concerned with how expensive certain computations are
  Time complexity
  Space complexity
  Theory
    Big-O, Big-Theta, Big-Omega
    Little-O, Little-Theta, Little-Omega
    Asymptotic Notation
    P vs. NP
    NP-Completeness
    The Complexity Zoo
  Practice: Optimization in Compilers
    Code Optimization
      Machine independent vs. machine dependent
      Constant folding
      Strength reductions
      Algebraic simplifications
      Operand reordering
      Unreachable code elimination
      Dead code elimination
      Copy propagation
      CSE
      Loop unrolling
      Special purpose instructions e.g. muladd, range, conditional jump
      Loop invariant factoring
      Tail recursion elimination
      Induction variable simplification
      Static frame allocation
      Stack frame simplification
    Low-level optimizations
      Special instructions
      Alignment
      Cache
      Removing conditional jumps
      Scheduling to remove load delays and similar things
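As a quick self-check on the regular expression items in the outline (eager vs. reluctant quantifiers and backreferences), here is a minimal JavaScript sketch. The input strings and variable names are made up for illustration and are not taken from the course notes. Standard JavaScript regular expressions do not offer possessive quantifiers, so only the eager and reluctant forms appear.

```javascript
// Hypothetical input chosen so that eager and reluctant matching differ.
const input = "<b>bold</b> and <i>italic</i>";

// Eager: .* consumes as much as it can, then backtracks to the LAST ">".
const eager = input.match(/<.*>/)[0];

// Reluctant: .*? consumes as little as it can, stopping at the FIRST ">".
const reluctant = input.match(/<.*?>/)[0];

// Backreference: \1 must repeat exactly what group 1 captured,
// so this detects an accidentally doubled word.
const doubled = /\b(\w+) \1\b/.test("this is is a typo");

console.log(eager);     // the whole input string
console.log(reluctant); // <b>
console.log(doubled);   // true
```

Try predicting each result before running it; if a prediction is off, revisit how backtracking works for that quantifier.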
On the course practice page, do the reinforcement problems related to the portions of the course that the exam will cover.
Here are things you should be able to do before taking the final. Quiz yourself. Quiz each other.
Do the Practice Final on BrightSpace! It is not as long as the real final, but making sure you know how to use BrightSpace as a platform for taking tests will help make sure you don't have a bad day and get frustrated by the technology during the real exam.
You have to put in the time for effortful self-study. Although the exam is open resources, you will not have time to look everything up. Those who come in with a strong comfort level with the material will finish on time. I am assessing your fluency and your proficiency with the material, not your Google-Fu.
An education is a long-term life journey. Education goes way beyond your chosen field and way beyond academics in general. That said, there is much to be gained by immersing oneself in the history, theory, and practice of computer science. Our culture is primarily literary, so to that end, you have been assigned a great deal of reading. Were you able to read or skim everything? I hope so, but if not, find time to catch up (or at least please consider catching up in the near future). Among the readings that will be helpful in your journey to becoming a computer scientist, review: