The Language Jax

The Jax language was made up by students in a compiler class in 2003.

0 Introduction

Jax (a Java almost-xubxet) is an imperative, object-oriented programming language. Its syntax is almost a pure subset of Java; the differences arise because Jax leaves out many features of Java while adding a couple of nice features of its own. Notable features of Jax include:

This document defines the language Jax.

1  Microsyntax

The source of a Jax program is a sequence of Unicode characters. Comments start with "//" and extend to the end of the line. Whitespace is any sequence of one or more characters with codepoints in the set {9, 10, 13, 32}. Tokens are formed by successively taking the longest substring that makes a valid token. Whitespace and comments always separate tokens.

Identifiers are nonempty strings of letters, decimal digits and underscores beginning with a letter, except for the following reserved words:

    interface   class     extends        implements   public   static
    final       boolean   char           int          double   string
    void        break     return         if           else     while
    for         in        synchronized   true         false    null
    this        new

Identifiers and reserved words are case sensitive. Integer literals are nonempty strings of digits. Floating point literals are described with the following regular expression:

    digit+ '.' digit* [('e'|'E') ['+'|'-'] digit+]
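
As an illustration only, the regular expression above can be transcribed into Python's re syntax. The sketch assumes that "digit" means an ASCII decimal digit, which the text does not state; the name is_float_literal is made up for this example.

```python
import re

# Transcription of: digit+ '.' digit* [('e'|'E') ['+'|'-'] digit+]
# Assumption: "digit" is one of the ASCII characters 0 through 9.
FLOAT_LITERAL = re.compile(r"[0-9]+\.[0-9]*(?:[eE][+-]?[0-9]+)?")


def is_float_literal(text):
    """Return True when the whole of text is a Jax floating point literal."""
    return FLOAT_LITERAL.fullmatch(text) is not None
```

Note that under this rule "1." is a floating point literal, while ".5" and "1e5" are not, because at least one digit and the decimal point are both required.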

String literals are sequences of zero or more printable characters, spaces or escape sequences, delimited by double quotes. The escape sequences are:

    \n           newline
    \t           tab
    \xxxxxxxx;   the character with the given codepoint, where xxxxxxxx is a sequence of one to eight hexadecimal digits
    \"           the double quote character
    \'           the single quote character
    \\           the backslash character
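
A minimal sketch of how a lexer might resolve these escape sequences, written in Python. It takes the text between the quotes of a literal and returns a list of codepoints rather than a host-language string, since an eight-digit hexadecimal escape can name a value that the host language may not accept as a character. The function name resolve_escapes and the choice to raise ValueError on a malformed escape are assumptions of this sketch.

```python
import re
from typing import List

# Codepoints for the single-character escapes listed above.
SIMPLE_ESCAPES = {"n": 10, "t": 9, '"': 34, "'": 39, "\\": 92}

# One to eight hexadecimal digits followed by a semicolon.
HEX_ESCAPE = re.compile(r"([0-9A-Fa-f]{1,8});")


def resolve_escapes(body: str) -> List[int]:
    """Turn the text between the quotes of a literal into codepoints."""
    codepoints = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            codepoints.append(ord(ch))
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash at end of literal")
        following = body[i + 1]
        if following in SIMPLE_ESCAPES:
            codepoints.append(SIMPLE_ESCAPES[following])
            i += 2
            continue
        match = HEX_ESCAPE.match(body, i + 1)
        if match is None:
            raise ValueError("bad escape sequence at offset %d" % i)
        codepoints.append(int(match.group(1), 16))
        i = match.end()
    return codepoints
```

The single-character escapes never collide with the hexadecimal form, because none of n, t, the quotes, or the backslash is a hexadecimal digit.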

A character literal is a character or escape sequence surrounded by single quotes. Note that the escape sequences must be resolved during lexical analysis. Neither string literals nor character literals may extend across a line break. Integer literals, float literals, character literals, string literals, and identifiers are all tokens.
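
The longest-substring rule can be sketched as follows in Python: at each position every token pattern is tried, and the longest match wins. This is one possible reading, not a reference lexer. It assumes ASCII letters and digits, and it collects the operator and punctuation spellings from the grammar in Section 2; the token kind names are invented for the example.

```python
import re

RESERVED = {
    "interface", "class", "extends", "implements", "public", "static",
    "final", "boolean", "char", "int", "double", "string",
    "void", "break", "return", "if", "else", "while",
    "for", "in", "synchronized", "true", "false", "null",
    "this", "new",
}

# Operator and punctuation spellings collected from the grammar in Section 2.
SYMBOLS = [
    "<<", ">>", "<=", ">=", "==", "!=", "&&", "||", "++", "--", "::",
    "{", "}", "(", ")", "[", "]", ";", ",", ".", "=", "?", ":",
    "|", "^", "&", "<", ">", "+", "-", "*", "/", "%", "!", "~", "#", "$",
]

ESCAPE = r"\\(?:[nt\"'\\]|[0-9A-Fa-f]{1,8};)"

PATTERNS = [
    ("FLOAT", re.compile(r"[0-9]+\.[0-9]*(?:[eE][+-]?[0-9]+)?")),
    ("INT", re.compile(r"[0-9]+")),
    # Assumption: "letter" means an ASCII letter.
    ("ID", re.compile(r"[A-Za-z][A-Za-z0-9_]*")),
    ("STRING", re.compile(r'"(?:[^"\\\n]|' + ESCAPE + r')*"')),
    ("CHAR", re.compile(r"'(?:[^'\\\n]|" + ESCAPE + r")'")),
    ("SYMBOL", re.compile("|".join(
        re.escape(s) for s in sorted(SYMBOLS, key=len, reverse=True)))),
]

# Whitespace (codepoints 9, 10, 13, 32) and // comments separate tokens.
SKIP = re.compile(r"(?:[\t\n\r ]+|//[^\n]*)+")


def tokenize(source):
    """Split source into (kind, text) pairs using the longest-match rule."""
    tokens = []
    pos = 0
    while pos < len(source):
        skipped = SKIP.match(source, pos)
        if skipped:
            pos = skipped.end()
            continue
        best_kind, best_end = None, pos
        for kind, pattern in PATTERNS:
            match = pattern.match(source, pos)
            if match and match.end() > best_end:
                best_kind, best_end = kind, match.end()
        if best_kind is None:
            raise ValueError("no token starts at offset %d" % pos)
        text = source[pos:best_end]
        if best_kind == "ID" and text in RESERVED:
            best_kind = "RESERVED"
        tokens.append((best_kind, text))
        pos = best_end
    return tokens
```

With this rule the input a+++b splits into a, ++, +, b, and forx is a single identifier rather than the reserved word for followed by x.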

2  Macrosyntax

We give the macrosyntax for Jax in EBNF. Brackets denote optional items, curly braces denote items that appear zero or more times, parentheses group items, a group followed by '+' appears one or more times, and vertical bars separate alternatives. Reserved words appear in lower case.

  PROGRAM      →  {UNIT}
  UNIT         →  INTERFACE | CLASS
  INTERFACE    →  interface ID [extends ID {',' ID}] '{' {METHODSIG ';'} '}'
  CLASS        →  class ID [implements ID {',' ID}] '{' {MEMBER} '}'
  MEMBER       →  FIELD | METHOD | CONSTRUCTOR
  FIELD        →  [public] [static] [final] TYPE ID ['=' EXP] ';'
  METHOD       →  [public] [static] METHODSIG BLOCK
  METHODSIG    →  (TYPE | void) ID '(' PARAMS ')'
  CONSTRUCTOR  →  [public] ID '(' PARAMS ')' BLOCK
  PARAMS       →  [TYPE ID {',' TYPE ID}]
  TYPE         →  (boolean | char | int | double | string | ID) {'[' ']'}
  BLOCK        →  '{' {STMT} '}'
  STMT         →  TYPE ID '=' EXP ';'
               |  EXP ';'
               |  VAR '=' EXP ';'
               |  break ';'
               |  return [EXP] ';'
               |  if '(' EXP ')' BLOCK {else if '(' EXP ')' BLOCK} [else BLOCK]
               |  while '(' EXP ')' BLOCK
               |  for '(' [TYPE ID '=' EXP] ';' [EXP] ';' [EXP] ')' BLOCK
               |  for '(' ID in EXP ')' BLOCK
               |  synchronized '(' EXP ')' BLOCK
  EXP          →  EXP0 {'?' EXP0 ':' EXP0}
  EXP0         →  EXP1 {'||' EXP1}
  EXP1         →  EXP2 {'&&' EXP2}
  EXP2         →  EXP3 {'|' EXP3}
  EXP3         →  EXP4 {'^' EXP4}
  EXP4         →  EXP5 {'&' EXP5}
  EXP5         →  EXP6 [RELOP EXP6]
  EXP6         →  EXP7 {SHIFTOP EXP7}
  EXP7         →  EXP8 {ADDOP EXP8}
  EXP8         →  EXP9 {MULOP EXP9}
  EXP9         →  [PREFIXOP] EXP10
  EXP10        →  LITERAL
               |  VAR
               |  INCDECOP VAR
               |  VAR INCDECOP
               |  new ID '(' ARGS ')'
               |  new TYPE ('[' EXP ']')+
               |  new TYPE ['{' ARGS '}']
               |  '(' EXP ')'
  LITERAL      →  null
               |  true
               |  false
               |  INTLIT
               |  FLOATLIT
               |  CHARLIT
               |  STRINGLIT
  VAR          →  [VARPREFIX] ID ['(' ARGS ')'] {VARSUFFIX}
  VARPREFIX    →  this '.' |  ID '::'
  VARSUFFIX    →  '[' EXP ']' | '.' ID ['(' ARGS ')']
  ARGS         →  [EXP {',' EXP}]
  RELOP        →  '<' | '<=' | '==' | '!=' | '>=' | '>'
  SHIFTOP      →  '<<' | '>>'
  ADDOP        →  '+' | '-'
  MULOP        →  '*' | '/' | '%'
  PREFIXOP     →  '-' | '!' | '~' | '#' | '$'
  INCDECOP     →  '++' | '--'
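
To show how the layered EXP rules encode precedence and associativity, here is a small recursive-descent sketch in Python covering only EXP5 through EXP10. It is deliberately restricted: the only literals are integers, parenthesized expressions restart at EXP5 rather than EXP, and the input is a list of token strings. The class and function names are hypothetical.

```python
RELOPS = {"<", "<=", "==", "!=", ">=", ">"}
SHIFTOPS = {"<<", ">>"}
ADDOPS = {"+", "-"}
MULOPS = {"*", "/", "%"}
PREFIXOPS = {"-", "!", "~", "#", "$"}


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def left_assoc(self, operand, ops):
        # X -> Y {OP Y}: the repetition builds a left-leaning tree.
        node = operand()
        while self.peek() in ops:
            op = self.take()
            node = (op, node, operand())
        return node

    def exp5(self):
        # EXP5 -> EXP6 [RELOP EXP6]: at most one relational operator.
        node = self.exp6()
        if self.peek() in RELOPS:
            op = self.take()
            node = (op, node, self.exp6())
        return node

    def exp6(self):
        return self.left_assoc(self.exp7, SHIFTOPS)

    def exp7(self):
        return self.left_assoc(self.exp8, ADDOPS)

    def exp8(self):
        return self.left_assoc(self.exp9, MULOPS)

    def exp9(self):
        # EXP9 -> [PREFIXOP] EXP10: at most one prefix operator.
        if self.peek() in PREFIXOPS:
            op = self.take()
            return (op, self.exp10())
        return self.exp10()

    def exp10(self):
        token = self.take()
        if token == "(":
            node = self.exp5()  # simplification: the grammar says EXP here
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isdigit():
            return int(token)
        raise SyntaxError("unexpected token %r" % token)


def parse(tokens):
    """Parse a list of token strings into nested tuples."""
    parser = Parser(tokens)
    node = parser.exp5()
    if parser.peek() is not None:
        raise SyntaxError("unexpected token %r" % parser.peek())
    return node
```

Read this way, the grammar makes 1 - 2 - 3 group to the left, while 1 < 2 < 3 and - - 1 are syntax errors unless parentheses are added.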

3  Semantics

We give the semantics of Jax informally.

3.1  Programs and Units

A Jax program is a collection of units that includes the units from the standard library. A unit is a class or an interface. Classes and interfaces are similar to their Java counterparts, with some notable limitations.

Interfaces contain method signatures only. Each signature is implicitly public and non-static.

Fields marked final cannot be modified at all; that is, there are no blank finals as in Java.

Classes cannot extend other classes; they can only implement interfaces. Classes may not be abstract: it is an error for a class that is declared to implement interfaces to fail to implement every method in those interfaces.

3.2  Declarations

A declaration binds an identifier to an entity. There are five types of declarations:

3.2.1 Scope

Each occurrence of an identifier is either a defining occurrence or a using occurrence. Using occurrences are legal only in the identifier's scope. The scope is determined as follows:

3.2.2 Uniqueness

The following rules restrict the choices for identifiers:

3.3  Types

Jax features the following types:

The types int and double are called the arithmetic types; the type string, together with array types, class types, and interface types, are called the reference types.

3.4  Blocks

Blocks are used to control the scope of variable declarations. A block consists of zero or more statements.

3.5  Variables

A variable is something that stores a value. All variables have a type. The kinds of variables are:

3.6  Statements

A statement is code that is executed solely for its side effect; it produces no value. The kinds of statements are:

3.7  Expressions

Each expression has a type and a value. The value of an expression with a reference type is either null or a reference to an object. Arrays and objects are therefore never manipulated directly, but only through references.

An expression e is type-compatible with a type t if and only if

An expression of type int can appear anywhere an expression of type double is expected; in this case the integer value is implicitly converted to a value of type double. The conversion must preserve the expression's value; this is always possible since the type double has 53 bits of precision.
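
The exactness claim can be checked with a short Python sketch, since Python's float is an IEEE 754 double with a 53-bit significand. The sketch assumes that int is a 32-bit two's complement type, which this document does not state outright (the description of Stream.read() in Section 4.4 suggests it); the name widen is invented here.

```python
# Assumption: Jax int is a 32-bit two's complement integer.
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def widen(i):
    """Model the implicit int to double conversion."""
    if not INT_MIN <= i <= INT_MAX:
        raise OverflowError("not representable as a 32-bit int")
    return float(i)
```

Every value in the assumed int range needs at most 32 significant bits, so it fits within the 53 bits available; the first integer that a double cannot hold exactly is 2**53 + 1, far outside that range.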

The signature of a method or constructor refers to the number, type, and order of its parameters. For example, if a method or constructor f is declared

f(t1 p1, t2 p2, t3 p3)

then its signature is the type list (t1, t2, t3). Constructors and methods can be declared with the ellipsis '...'; for example,

f(t1 p1, t2 p2, t3 p3, ...)

has signature (t1, t2, t3, MORE). An expression list (e1, ..., en) is said to match a signature (t1, ..., tk) if (1) n=k and each ei is type-compatible (as defined above) with ti, or (2) tk=MORE and k-1<=n and each of e1 through e[k-1] is type-compatible with the corresponding one of t1 through t[k-1].
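
The matching rule can be sketched in Python as follows. The sketch works on the types of the argument expressions rather than on the expressions themselves, and its type_compatible function is a placeholder that accepts only identical types and the int to double conversion; the full definition of type compatibility is not reproduced here. All names are assumptions of the example.

```python
MORE = "MORE"  # stands for the trailing '...' in a declaration


def type_compatible(exp_type, param_type):
    # Placeholder rule (assumption): identical types, or int where
    # double is expected.
    return exp_type == param_type or (
        exp_type == "int" and param_type == "double")


def matches(arg_types, signature):
    """Return True if the argument types match the signature."""
    n, k = len(arg_types), len(signature)
    if k > 0 and signature[-1] == MORE:
        # Case (2): the first k-1 arguments must be compatible; any
        # further arguments are accepted unchecked.
        fixed = signature[:-1]
        return k - 1 <= n and all(
            type_compatible(e, t) for e, t in zip(arg_types, fixed))
    # Case (1): same length and pairwise compatible.
    return n == k and all(
        type_compatible(e, t) for e, t in zip(arg_types, signature))
```

For instance, a call with argument types (string, int, char) matches the signature (string, MORE), while a call with no arguments does not.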

The Jax expressions are as follows. Note that the semantics given to operators here refer only to the built-in (non-overloaded) behavior of each operator.

4  Standard Library

The following classes are assumed to exist in the runtime environment of every Jax program.

4.1  The Text class

class Text {
    public static int codepoint(char c) {...}
    public static char character(int i) {...}
    public static int indexOf(string s, char c) {...}
    public static char charAt(string s, int i) {...}
    public static string substring(string s, int start, int length) {...}
    public static int parseInt(string s) {...}
    public static double parseDouble(string s) {...}
    public static boolean useLocalizationsFrom(Stream s) {...}
    public static string format(string s, ...) {...}
}
codepoint(c)
Returns the codepoint of character c.
character(i)
Returns the character whose codepoint is i.
indexOf(s, c)
Returns the index of the first occurrence of c within s, or -1 if c does not occur within s.
charAt(s, i)
Returns the character at position i within s.
substring(s, start, length)
Returns the string consisting of the length characters of s starting at index start. If start is beyond the end of s, returns the empty string. If length is too large, the returned string consists only of the characters from start up to the end of s.
parseInt(s)
Returns the integer that s represents.
parseDouble(s)
Returns the double that s represents.
useLocalizationsFrom(s)
Reads text from stream s, which must be a sequence of lines of the form k=v, and makes all these pairs comprise the current localization dictionary used by Text.format().
format(s, ...)
Similar to sprintf() in C, except that the format string is a localization key and the created string is returned from the method rather than written through a pointer argument. If s is not a key in the current localization dictionary, then it is used directly as the format string.
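
The behavior described for substring and for the localization dictionary can be modeled in Python as below. This is a sketch under several assumptions: negative arguments to substring are rejected because the text does not cover them, useLocalizationsFrom is taken to return false when a line lacks '=', and Python's % operator stands in for C-style formatting even though the two are not identical. The Localizer class is hypothetical.

```python
def substring(s, start, length):
    """Model of Text.substring with the clamping described above."""
    if start < 0 or length < 0:
        raise ValueError("negative arguments are not covered by the text")
    if start >= len(s):
        return ""
    return s[start:start + length]  # slicing stops at the end of s


class Localizer:
    """Model of the dictionary behind useLocalizationsFrom and format."""

    def __init__(self):
        self.table = {}

    def use_localizations_from(self, lines):
        table = {}
        for line in lines:
            key, sep, value = line.rstrip("\r\n").partition("=")
            if not sep:
                return False  # assumption: a malformed line is a failure
            table[key] = value
        self.table = table
        return True

    def format(self, s, *args):
        # Fall back to s itself when it is not a localization key.
        template = self.table.get(s, s)
        return template % args
```

A short usage example: after loading the line greeting=Hola, %s, the call format("greeting", "Ana") produces "Hola, Ana", while format("%d items", 3) uses its first argument directly and produces "3 items".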

4.2  The Math class

class Math {
    public static final double PI = ...;
    public static double sqrt(double x) {...}
    public static double sin(double x) {...}
    public static double cos(double x) {...}
    public static double atan(double x, double y) {...}
    public static double ln(double x) {...}
}
PI
The value of π.
sqrt(x)
The square root of x.
sin(x)
The sine of x.
cos(x)
The cosine of x.
atan(x, y)
ln(x)
The natural logarithm of x.

4.3  The Io class

class Io {
    public static int printf(string format, ...) {...}
    public static final Stream STDIN = ...;
    public static final Stream STDOUT = ...;
}
printf(format, ...)
Convenient shorthand for Io.STDOUT.printf().
STDIN
The standard input stream.
STDOUT
The standard output stream.

4.4  The Stream class

class Stream {
    public Stream forFile(string filename, string mode) {...}
    public Stream forSocket(Socket socket, string mode) {...}
    public int read() {...}
    public char readChar() {...}
    public ByteArray read(int count) {...}
    public string readLine() {...}
    public int write(ByteArray bytes) {...}
    public void write(string s) {...}
    public int printf(string format, ...) {...}
    public boolean close() {...}
}
forFile(filename, mode)
Returns a stream associated with the file with the given name. Mode is "w" for write-only or "r" for read-only.
forSocket(socket, mode)
Returns a stream associated with this socket. Mode is "w" for write-only or "r" for read-only. Returns null if the socket has not established a connection, or is a listening socket.
read()
Reads the next octet from this stream. Blocks until an octet is available. Returns the octet in the lower eight bits of its result (with the upper 24 bits clear), or returns -1 if the stream is not open.
readChar()
Returns the next character to be read from this stream, or \ffffffff; if there are no characters remaining. This is a blocking call.
read(count)
Reads at most the requested number of octets from the available octets on this stream. Returns the number of octets actually read, which may be less than the amount requested.
readLine()
Reads characters from the stream up to and including the first newline character, or until the end of the input file is reached. Returns a string consisting of all consumed characters not including the newline character. Octets are converted to characters according to the default character encoding. Returns null if the end of file had previously been reached. This is a blocking call.
write(bytes)
Writes the bytes from the specified array to this stream.
write(s)
Writes the given string to this stream.
printf(format, ...)
Same as fprintf in C.
close()
Closes this stream.
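
The readLine() contract described above can be modeled over any Python binary stream. The sketch assumes UTF-8 as the default character encoding and treats "end of file had previously been reached" as "no octets remain when the call begins"; both are interpretations, and read_line is an invented name.

```python
import io


def read_line(stream, encoding="utf-8"):
    """Model of readLine(): the next line without its newline, or None."""
    octets = bytearray()
    while True:
        octet = stream.read(1)
        if not octet:
            # End of input: report None only if nothing was consumed.
            if not octets:
                return None
            break
        if octet == b"\n":
            break
        octets += octet
    return octets.decode(encoding)


# Usage: a final line without a trailing newline is still returned.
demo = io.BytesIO(b"one\n\ntwo")
lines = [read_line(demo) for _ in range(4)]
```

Here lines holds "one", the empty string for the blank line, "two", and finally None.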

4.5  The Runnable interface

interface Runnable {
    void run();
}

4.6  The Thread class

class Thread {
    public static Thread start(Runnable r) {...}
    public static Thread currentThread() {...}
    public static void sleep(int millis) {...}
    public void interrupt() {...}
    public boolean isInterrupted() {...}
}
start(r)
Starts a new thread on which to run r's run() method. Returns a reference to this new thread.
currentThread()
Returns a reference to the currently executing thread.
sleep(millis)
Causes the current thread to sleep for the specified number of milliseconds.
interrupt()
Sets the interrupted status of this thread to true.
isInterrupted()
Returns the interrupted status of this thread.

4.7  The Socket class

class Socket {
    public static Socket createListener(int port) {...}
    public Socket accept() {...}
    public static Socket createClientFor(int inetAddress, int port) {...}
    public void close() {...}
}
createListener(port)
Creates a listening TCP socket on the specified port. Returns null if another socket is already bound to that port.
accept()
A blocking call that returns, for this listening socket, a new socket to communicate with the client that has just connected.
createClientFor(inetAddress, port)
Returns a new socket connected to the socket at the specified remote IP address and port. Returns null if a connection cannot be established.
close()
Closes this socket.

5  Programs

Unlike Java, Jax does not require a dynamic runtime system with class loaders and the potential for a NoSuchMethodError to be thrown. Instead, Jax code is intended to be compiled and linked into a standard executable file, which a host operating system can run in the usual way.

At link time, a class containing a public static void method called main, taking a single argument of type string[], must be specified; this class provides the entry point of the executable program. The operating system passes command line arguments to this method.

The Jax language definition does not specify the manner in which source code (a sequence of characters) is presented to a compiler. All that is required is that the compiler see some set of classes and interfaces. An implementation may require all units to appear in a single source file; other implementations may allow separate compilation of classes and interfaces. Compilers that accept separately compiled units must specify some mechanism to handle units not defined in the file currently being compiled; for example, if a compiler encounters a reference to a class C not in the current source file, it might look for a file called C.jax and compile that, returning to the original file when C has been compiled. Some safeguard must be built into this mechanism to prevent circular references from causing the compiler to enter an infinite loop. A compiler may employ a more sophisticated approach, looking not only for C.jax but also C.o (or C.obj). If the date on the object file is newer than that of the source file, a compiler might assume the foreign class is already compiled, and extract interface information from the object file. Mechanisms for doing this are compiler-specific and not part of the Jax language specification.
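
One possible safeguard against circular references, sketched in Python: remember every unit whose compilation has started and never start it twice. The references dictionary is a stand-in for what a compiler would discover while reading each source file, and both function names are hypothetical; the timestamp check mirrors the object-file heuristic described above.

```python
def compile_order(root, references):
    """Return the units in the order their compilation would finish.

    references maps a unit name to the names of the units it mentions.
    """
    started = set()
    finished = []

    def visit(unit):
        if unit in started:
            return  # already compiled or in progress: do not recurse
        started.add(unit)
        for dependency in references.get(unit, ()):
            visit(dependency)
        finished.append(unit)

    visit(root)
    return finished


def needs_recompile(source_mtime, object_mtime=None):
    """True unless an object file exists and is newer than the source."""
    return object_mtime is None or object_mtime <= source_mtime
```

With A referring to B, and B referring to both C and A, compile_order("A", ...) finishes C, then B, then A, and the back reference from B to A is simply skipped.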