LMU ☀️ CMSI 3802
LANGUAGES AND AUTOMATA II
HOMEWORK #2 Due: 2024-02-16

Learning Objectives

In this assignment you will demonstrate:

Readings, Videos, and Self-Study

Although your team will be turning in only one submission, every team member is individually responsible for each of the following:

Instructions

As usual, turn in a single submission for your entire group via BrightSpace. Remember to not simply partition the work among team members, as this reduces your learning.

Your team's BrightSpace submission should be an attached PDF or text file with:

  1. The names of every student
  2. For each student, an affidavit that each of the readings and watchings above were individually completed
  3. Answers to each ot the “Problems to Turn In,” numbered and neatly typeset
  4. A link to the public GitHub repository hosting your team’s project

Problems to Turn In

Solutions to these problems will appear in the beautifully formatted PDF that you will submit to BrightSpace. For problems involving code to write, host your code on GitHub or Replit or OneCompiler (or similar) and provide the link to the hosting site in your BrightSpace submission.

  1. Give tiny little Ohm grammars for:
    1. Canadian Postal Codes (make sure to prohibit D, F, I, O, Q, U)
    2. Legal Visa® Card Numbers, ignoring the Luhn checksums, i.e., accept 4 + 15 digits or 4 + 12 digits.
    3. Legal MasterCard® Numbers, ignoring the Luhn checksums, i.e., accept 51-55 + 14 digits or 2221-2720 + 12 digits.
    4. Strings of Unicode letters except those strings that are exactly three letters ending with two Latin letter o’s, of any case.
    5. Binary numerals divisible by 16.
    6. Decimal numerals in the range 8 through 32, inclusive.
    7. All strings of Unicode letters, except python, pycharm, or pyc.
    8. Floating point constants that are allowed to have an empty fractional part, but whose exponent part is required and can have no more than three digits in the exponent part
    9. Palindromes over the letters a, b, and c, of length 2, 3, 5, or 8
    10. Python string literals. Don't get too fancy here—just translate what you see in the Python Reference linked above into Ohm notation.

    Here is a Mocha test for you:

    import assert from "assert"
    import { matches } from "../ohm_exercises.js"
    
    const testFixture = {
      canadianPostalCode: {
        good: ["A7X 2P8", "P8E 4R2", "K1V 9P2", "Y3J 5C0"],
        bad: ["A7X   9B2", "C7E 9U2", "", "Dog", "K1V\t9P2", " A7X 2P8", "A7X 2P8 "],
      },
      visa: {
        good: ["4128976567772613", "4089655522138888", "4098562516243"],
        bad: [
          "43333",
          "42346238746283746823",
          "7687777777263211",
          "foo",
          "π",
          "4128976567772613 ",
        ],
      },
      masterCard: {
        good: [
          "5100000000000000",
          "5294837679998888",
          "5309888182838282",
          "5599999999999999",
          "2221000000000000",
          "2720999999999999",
          "2578930481258783",
          "2230000000000000",
        ],
        bad: [
          "5763777373890002",
          "513988843211541",
          "51398884321108541",
          "",
          "OH",
          "5432333xxxxxxxxx",
        ],
      },
      notThreeEndingInOO: {
        good: ["", "fog", "Tho", "one", "a", "ab", "food"],
        bad: ["fOo", "gOO", "HoO", "zoo", "MOO", "123", "A15"],
      },
      divisibleBy16: {
        good: ["0", "00", "000", "00000", "00000", "000000", "00000000", "1101000000"],
        bad: ["1", "00000000100", "1000000001", "dog0000000"],
      },
      eightThroughThirtyTwo: {
        good: Array(25)
          .fill(0)
          .map((x, i) => i + 8),
        bad: ["1", "0", "00003", "dog", "", "361", "90", "7", "-11"],
      },
      notPythonPycharmPyc: {
        good: ["", "pythons", "pycs", "PYC", "apycharm", "zpyc", "dog", "pythonpyc"],
        bad: ["python", "pycharm", "pyc"],
      },
      restrictedFloats: {
        good: ["1e0", "235e9", "1.0e1", "1.0e+122", "55e20"],
        bad: ["3.5E9999", "2.355e-9991", "1e2210"],
      },
      palindromes2358: {
        good: [
          "aa",
          "bb",
          "cc",
          "aaa",
          "aba",
          "aca",
          "bab",
          "bbb",
          "ababa",
          "abcba",
          "aaaaaaaa",
          "abaaaaba",
          "cbcbbcbc",
          "caaaaaac",
        ],
        bad: ["", "a", "ab", "abc", "abbbb", "cbcbcbcb"],
      },
      pythonStringLiterals: {
        good: String.raw`''
          ""
          'hello'
          "world"
          'a\'b'
          "a\"b"
          '\n'
          "a\tb"
          f'\u'
          """abc"""
          '''a''"''"'''
          """abc\xdef"""
          '''abc\$def'''
          '''abc\''''`
          .split("\n")
          .map((s) => s.trim()),
        bad: String.raw`
          'hello"
          "world'
          'a'b'
          "a"b"
          'a''
          "x""
          """"""""
          frr"abc"
          'a\'
          '''abc''''
          """`
          .split("\n")
          .map((s) => s.trim()),
      },
    }
    
    for (let name of Object.keys(testFixture)) {
      describe(`when matching ${name}`, () => {
        for (let s of testFixture[name].good) {
          it(`accepts ${s}`, () => {
            assert.ok(matches(name, s))
          })
        }
        for (let s of testFixture[name].bad) {
          it(`rejects ${s}`, () => {
            assert.ok(!matches(name, s))
          })
        }
      })
    }
    
  2. Here is a description of a language. Programs in this language are made up of a possibly empty sequence of function declarations, followed by a single expression. Each function declaration starts with the keyword func followed by the function’s name (an identifier), then a parenthesized list of zero or more parameters (also identifiers) separated by commas, then the body, which is a sequence of one or more expressions separated (NOT terminated) by semicolons with the expression sequence terminated with the keyword end. Expressions can be numeric literals, string literals, identifiers, function calls, or can be made up of other expressions with the usual binary arithmetic operators (plus, minus, times, divide) and a unary prefix negation and a unary postfix factorial (!). There’s a conditional expression with the syntax y if x else z. Factorial has the highest precedence, followed by negation, the multiplicative operators, the additive operators, and finally the conditional. Parentheses are used, as in most other languages, to group subexpressions. Numeric literals are non-empty sequences of decimal digits with an optional fractional part and an optional exponent part. String literals delimited with double quotes with the escape sequences \', \", \n, \\, and \u{hhhhhh} where xxxxxx is a sequence of one-to-six hexadecimal digits. Identifiers are non-empty sequences of letters, decimal digits, underscores, at-signs, and dollar signs, beginning with a letter or at-sign, that are not also reserved words. Function calls are formed with an identifier followed by a comma-separated list of expressions bracketed by square brackets. Comments are -- until the end of the line.

    Write a complete syntactic description of this language in Ohm. (Hint: use the Ohm editor to check your work. When grading, I will copy-paste your submitted solution to BrightSpace into the Ohm editor with my own detailed test suite).

  3. Draw syntax diagrams for the phrase rules (not the lexical rules) for the language in the previous problem.

For Your Project

Continue your compiler project in the public GitHub repository you created in the first assignment. You will be expanding your repo to have the following:

  .
  ├── .gitignore
  ├── README.md
  ├── LICENSE
  ├── package.json            -- configuration because this is a Node.js app
  ├── .prettierrc.json        -- (optional, you don’t have to have one)
  ├── docs
  │   └── ...                 -- image file for your logo
  ├── examples
  │   └── ...                 -- lots of example programs
  ├── src
  │   ├── (yourlanguagename).js
  │   ├── (yourlanguagename).ohm
  │   ├── compiler.js         -- can be copied+modified from Bella
  │   ├── core.js             -- this will be empty
  │   ├── parser.js           -- can be copied+modified from Bella
  │   ├── analyzer.js         -- contains just the stub function
  │   ├── optimizer.js        -- contains just the stub function
  │   └── generator.js        -- contains just the stub function
  └── test
      ├── compiler.test.js    -- can be copied from Bella, but has to be scaled back
      └── parser.test.js      -- new, test Ohm parsing here

Your tasks for this assignment are the following:

Grading Rubric

To help you get a good score, here are all the things you will be graded on.