LMU ☀️ CMSI 3802
LANGUAGES AND AUTOMATA II
HOMEWORK #2 Due: 2025-02-21

Learning Objectives

In this assignment you will demonstrate:

Readings, Videos, and Self-Study

Although your team will be turning in only one submission, every team member is individually responsible for each of the following:

Submission Instructions

As usual, turn in a single submission for your entire group via BrightSpace. Remember to not simply partition the work among team members, as this reduces your learning.

Your team's BrightSpace submission should be an attached PDF or text file with:

Exercises

  1. Give tiny little Ohm grammars for:
    1. Canadian Postal Codes (make sure to prohibit D, F, I, O, Q, U)
    2. Legal Visa® Card Numbers, ignoring the Luhn checksums, i.e., accept 4 + 15 digits or 4 + 12 digits.
    3. Legal MasterCard® Numbers, ignoring the Luhn checksums, i.e., accept 51-55 + 14 digits or 2221-2720 + 12 digits.
    4. Strings of Basic Latin letters except those strings that are exactly three letters ending with two Latin letter o’s, of any case.
    5. Binary numerals divisible by 16.
    6. Decimal numerals in the range 8 through 32, inclusive.
    7. All strings of Unicode letters, except python, pycharm, or pyc.
    8. Floating point constants that are allowed to have an empty fractional part, but whose exponent part is required and can have no more than three digits in the exponent part
    9. Palindromes over the letters a, b, and c, of length 2, 3, 5, or 8
    10. Python string literals. Don't get too fancy here—just translate what you see in the Python Reference linked above into Ohm notation.

    Feel free to use the following template, which comes with some unit tests:

    import { describe, it } from "node:test"
    import assert from "assert"
    import * as ohm from "ohm-js"
    
    const grammars = {
      canadianPostalCode: String.raw`
        write your grammar here
      `,
    
      visa: String.raw`
        write your grammar here
      `,
    
      masterCard: String.raw`
        write your grammar here
      `,
    
      notThreeEndingInOO: String.raw`
        write your grammar here
      `,
    
      divisibleBy16: String.raw`
        write your grammar here
      `,
    
      eightThroughThirtyTwo: String.raw`
        write your grammar here
      `,
    
      notPythonPycharmPyc: String.raw`
        write your grammar here
      `,
    
      restrictedFloats: String.raw`
        write your grammar here
      `,
    
      palindromes2358: String.raw`
        write your grammar here
      `,
    
      pythonStringLiterals: String.raw`
        write your grammar here
      `,
    }
    
    function matches(name, string) {
      const grammar = `G {${grammars[name]}}`
      return ohm.grammar(grammar).match(string).succeeded()
    }
    
    const testFixture = {
      canadianPostalCode: {
        good: ["A7X 2P8", "P8E 4R2", "K1V 9P2", "Y3J 5C0"],
        bad: [
          "A7X   9B2",
          "C7E 9U2",
          "",
          "Dog",
          "K1V\t9P2",
          " A7X 2P8",
          "A7X 2P8 ",
        ],
      },
      visa: {
        good: ["4128976567772613", "4089655522138888", "4098562516243"],
        bad: [
          "43333",
          "42346238746283746823",
          "7687777777263211",
          "foo",
          "π",
          "4128976567772613 ",
        ],
      },
      masterCard: {
        good: [
          "5100000000000000",
          "5294837679998888",
          "5309888182838282",
          "5599999999999999",
          "2221000000000000",
          "2720999999999999",
          "2578930481258783",
          "2230000000000000",
        ],
        bad: [
          "5763777373890002",
          "513988843211541",
          "51398884321108541",
          "",
          "OH",
          "5432333xxxxxxxxx",
        ],
      },
      notThreeEndingInOO: {
        good: ["", "fog", "Tho", "one", "a", "ab", "food"],
        bad: ["fOo", "gOO", "HoO", "zoo", "MOO", "123", "A15"],
      },
      divisibleBy16: {
        good: [
          "0",
          "00",
          "000",
          "00000",
          "00000",
          "000000",
          "00000000",
          "1101000000",
        ],
        bad: ["1", "00000000100", "1000000001", "dog0000000"],
      },
      eightThroughThirtyTwo: {
        good: Array(25)
          .fill(0)
          .map((x, i) => i + 8),
        bad: ["1", "0", "00003", "dog", "", "361", "90", "7", "-11"],
      },
      notPythonPycharmPyc: {
        good: [
          "",
          "pythons",
          "pycs",
          "PYC",
          "apycharm",
          "zpyc",
          "dog",
          "pythonpyc",
        ],
        bad: ["python", "pycharm", "pyc"],
      },
      restrictedFloats: {
        good: ["1e0", "235e9", "1.0e1", "1.0e+122", "55e20"],
        bad: ["3.5E9999", "2.355e-9991", "1e2210"],
      },
      palindromes2358: {
        good: [
          "aa",
          "bb",
          "cc",
          "aaa",
          "aba",
          "aca",
          "bab",
          "bbb",
          "ababa",
          "abcba",
          "aaaaaaaa",
          "abaaaaba",
          "cbcbbcbc",
          "caaaaaac",
        ],
        bad: ["", "a", "ab", "abc", "abbbb", "cbcbcbcb"],
      },
      pythonStringLiterals: {
        good: String.raw`''
          ""
          'hello'
          "world"
          'a\'b'
          "a\"b"
          '\n'
          "a\tb"
          f'\u'
          """abc"""
          '''a''"''"'''
          """abc\xdef"""
          '''abc\$def'''
          '''abc\''''`
          .split("\n")
          .map((s) => s.trim()),
        bad: String.raw`
          'hello"
          "world'
          'a'b'
          "a"b"
          'a''
          "x""
          """"""""
          frr"abc"
          'a\'
          '''abc''''
          """`
          .split("\n")
          .map((s) => s.trim()),
      },
    }
    
    for (let name of Object.keys(testFixture)) {
      describe(`when matching ${name}`, () => {
        for (let s of testFixture[name].good) {
          it(`accepts ${s}`, () => {
            assert.ok(matches(name, s))
          })
        }
        for (let s of testFixture[name].bad) {
          it(`rejects ${s}`, () => {
            assert.ok(!matches(name, s))
          })
        }
      })
    }
    
  2. Here is a description of a language. Programs in this language are made up of a possibly empty sequence of function declarations, followed by a single expression. Each function declaration starts with the keyword func followed by the function’s name (an identifier), then a parenthesized list of zero or more parameters (also identifiers) separated by commas, then the body, which is a sequence of one or more expressions separated (NOT terminated) by semicolons with the expression sequence terminated with the keyword end. Expressions can be numeric literals, string literals, identifiers, function calls, or can be made up of other expressions with the usual binary arithmetic operators (plus, minus, times, divide) and a unary prefix negation and a unary postfix factorial (!). There’s a conditional expression with the syntax y if x else z. Factorial has the highest precedence, followed by negation, the multiplicative operators, the additive operators, and finally the conditional. Parentheses are used, as in most other languages, to group subexpressions. Numeric literals are non-empty sequences of decimal digits with an optional fractional part and an optional exponent part. String literals delimited with double quotes with the escape sequences \', \", \n, \\, and \u{hhhhhh} where hhhhhh is a sequence of one-to-six hexadecimal digits. Identifiers are non-empty sequences of letters, decimal digits, underscores, at-signs, and dollar signs, beginning with a letter or at-sign, that are not also reserved words. Function calls are formed with an identifier followed by a comma-separated list of expressions bracketed by square brackets. Comments are -- until the end of the line.

    Write a single example program that covers every aspect of this definition.

  3. For the language described in the previous exercise, write a complete syntactic description of this language in Ohm. (Hint: use the Ohm editor to check your work. Use your everything-program from the previous exercise as a positive test, but write a few negative tests, too, so that you can check that your grammar does not “match too much.” When grading, I will copy-paste your submitted solution into the Ohm editor with my own detailed test suite).

For Your Project

Continue your compiler project in the public GitHub repository you created in the first assignment. You will be expanding your repo to have the following:

  .
  ├── .gitignore
  ├── README.md
  ├── LICENSE
  ├── package.json
  ├── .prettierrc.json
  ├── docs
  │   └── ...                 -- image file for your logo
  ├── examples
  │   └── ...                 -- lots of example programs
  ├── src
  │   ├── (yourlanguagename).js
  │   ├── (yourlanguagename).ohm
  │   ├── compiler.js         -- copied from the course notes
  │   ├── core.js             -- this will be empty for now
  │   ├── parser.js           -- copied from the course notes
  │   ├── analyzer.js         -- contains just the stub function
  │   ├── optimizer.js        -- contains just the stub function
  │   └── generator.js        -- contains just the stub function
  └── test
      ├── compiler.test.js    -- just test that you can invoke the parser
      └── parser.test.js      -- new, test Ohm parsing here

Your tasks for this assignment are the following:

Grading Rubric

To help you get a good score, here are all the things you will be graded on.