Jax (a Java almost-xubxet) is an imperative, object-oriented programming language. Its syntax is almost a pure subset of Java; the differences arise because Jax leaves out many features of Java while adding a couple of nice features of its own. Notable features of Jax include:
printf in its standard library
length ('#') and toString ('$') operators
This document defines the language Jax.
The source of a Jax program is a sequence of Unicode characters.
Comments start with "//" and extend to the end of the line.
Whitespace is any sequence of one or more characters with codepoints
in the set {9, 10, 13, 32}. Tokens are formed by successively taking
the longest substring that makes a valid token. Whitespace and comments
always separate tokens.
Identifiers are nonempty strings of letters, decimal digits and underscores beginning with a letter, except for the following reserved words:
interface class extends implements public static final boolean char int double string void break return if else while for in synchronized true false null this new
Identifiers and reserved words are case sensitive. Integer literals are nonempty strings of digits. Floating point literals are described with the following regular expression:
digit+ '.' digit* [('e'|'E') ['+'|'-'] digit+]
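As a rough illustration, the regular expression above can be transcribed into java.util.regex syntax. The sketch is written in Java rather than Jax, the names FloatLiteralSketch and isFloatLiteral are hypothetical, and it assumes that "digit" means the ASCII digits 0 through 9.

```java
import java.util.regex.Pattern;

final class FloatLiteralSketch {
    // digit+ '.' digit* [('e'|'E') ['+'|'-'] digit+]
    // Assumption: "digit" is an ASCII decimal digit.
    private static final Pattern FLOAT =
        Pattern.compile("[0-9]+\\.[0-9]*([eE][+-]?[0-9]+)?");

    // True if the whole text is a floating point literal under the rule above.
    static boolean isFloatLiteral(String text) {
        return FLOAT.matcher(text).matches();
    }
}
```

Under this reading, a literal needs at least one digit before the point, so "3." is accepted while ".5" and "42" are not.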
String literals are sequences of zero or more printable characters, spaces or escape sequences, delimited by double quotes. The escape sequences are:
\n | newline |
\t | tab |
\xxxxxxxx; | the character with codepoint xxxxxxxx, where xxxxxxxx is a sequence of one to eight hexadecimal digits |
\" | the double quote character |
\' | the single quote character |
\\ | the backslash character |
A character literal is a character or escape sequence surrounded by single quotes. Note that the escape sequences must be resolved during lexical analysis. Neither string literals nor character literals may extend across a line break. Integer literals, float literals, character literals, string literals, and identifiers are all tokens.
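One way a lexer might resolve the escape sequences listed above is sketched below in Java. The class and method names (EscapeSketch, unescape) are hypothetical, the input is assumed to be the body of a literal with its delimiting quotes already removed, and rejecting codepoints above the Unicode maximum is an assumption that the text does not spell out.

```java
final class EscapeSketch {
    // Returns the value of an ASCII hexadecimal digit, or -1 if c is not one.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Replaces every escape sequence in the literal body with the character it denotes.
    static String unescape(String body) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i++);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (i >= body.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = body.charAt(i);
            switch (e) {
                case 'n':  out.append('\n'); i++; break;
                case 't':  out.append('\t'); i++; break;
                case '"':  out.append('"');  i++; break;
                case '\'': out.append('\''); i++; break;
                case '\\': out.append('\\'); i++; break;
                default: {
                    // Hexadecimal form: one to eight hex digits followed by ';'.
                    long codepoint = 0;
                    int digits = 0;
                    while (i < body.length() && digits < 8 && hexValue(body.charAt(i)) >= 0) {
                        codepoint = codepoint * 16 + hexValue(body.charAt(i));
                        digits++;
                        i++;
                    }
                    boolean terminated = i < body.length() && body.charAt(i) == ';';
                    if (digits == 0 || !terminated || codepoint > Character.MAX_CODE_POINT) {
                        throw new IllegalArgumentException("malformed escape sequence");
                    }
                    i++; // skip the terminating ';'
                    out.appendCodePoint((int) codepoint);
                }
            }
        }
        return out.toString();
    }
}
```

For example, the literal body a\41;b resolves to the three characters a, A, b, since 41 is the hexadecimal codepoint of 'A'.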
We give the macrosyntax for Jax in EBNF. Brackets denote optional items, curly braces denote items that appear zero or more times, and vertical bars separate alternatives. Reserved words appear in lower case.
PROGRAM     → {UNIT}
UNIT        → INTERFACE | CLASS
INTERFACE   → interface ID [extends ID {',' ID}] '{' {METHODSIG ';'} '}'
CLASS       → class ID [implements ID {',' ID}] '{' {MEMBER} '}'
MEMBER      → FIELD | METHOD | CONSTRUCTOR
FIELD       → [public] [static] [final] TYPE ID ['=' EXP] ';'
METHOD      → [public] [static] METHODSIG BLOCK
METHODSIG   → (TYPE | void) ID '(' PARAMS ')'
CONSTRUCTOR → [public] ID '(' PARAMS ')' BLOCK
PARAMS      → [TYPE ID {',' TYPE ID}]
TYPE        → (boolean | char | int | double | string | ID) {'[' ']'}
BLOCK       → '{' {STMT} '}'
STMT        → TYPE ID '=' EXP ';'
            | EXP ';'
            | VAR '=' EXP ';'
            | break ';'
            | return [EXP] ';'
            | if '(' EXP ')' BLOCK {else if '(' EXP ')' BLOCK} [else BLOCK]
            | while '(' EXP ')' BLOCK
            | for '(' [TYPE ID '=' EXP] ';' [EXP] ';' [EXP] ')' BLOCK
            | for '(' ID in EXP ')' BLOCK
            | synchronized '(' EXP ')' BLOCK
EXP         → EXP0 {'?' EXP0 ':' EXP0}
EXP0        → EXP1 {'||' EXP1}
EXP1        → EXP2 {'&&' EXP2}
EXP2        → EXP3 {'|' EXP3}
EXP3        → EXP4 {'^' EXP4}
EXP4        → EXP5 {'&' EXP5}
EXP5        → EXP6 [RELOP EXP6]
EXP6        → EXP7 {SHIFTOP EXP7}
EXP7        → EXP8 {ADDOP EXP8}
EXP8        → EXP9 {MULOP EXP9}
EXP9        → [PREFIXOP] EXP10
EXP10       → LITERAL
            | VAR
            | INCDECOP VAR
            | VAR INCDECOP
            | new ID '(' ARGS ')'
            | new TYPE ('[' EXP ']')+
            | new TYPE ['{' ARGS '}']
            | '(' EXP ')'
LITERAL     → null | true | false | INTLIT | FLOATLIT | CHARLIT | STRINGLIT
VAR         → [VARPREFIX] ID ['(' ARGS ')'] {VARSUFFIX}
VARPREFIX   → this '.' | ID '::'
VARSUFFIX   → '[' EXP ']' | '.' ID ['(' ARGS ')']
ARGS        → [EXP {',' EXP}]
RELOP       → '<' | '<=' | '==' | '!=' | '>=' | '>'
SHIFTOP     → '<<' | '>>'
ADDOP       → '+' | '-'
MULOP       → '*' | '/' | '%'
PREFIXOP    → '-' | '!' | '~' | '#' | '$'
INCDECOP    → '++' | '--'
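To make the grammar more concrete, here is a small unit that appears to stay within the productions above: a class with one static method, a local declaration, assignments, a while loop and a return. The class name Gcd is illustrative only. The text was checked as Java, where it also compiles, so treat it as a sketch of what a Jax unit might look like rather than as output of a Jax compiler.

```java
class Gcd {
    // Greatest common divisor by Euclid's algorithm.
    public static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;   // STMT: TYPE ID '=' EXP ';'
            a = b;           // STMT: VAR '=' EXP ';'
            b = t;
        }
        return a;            // STMT: return [EXP] ';'
    }
}
```

Note that every loop body is a BLOCK, as the grammar requires, and that only plain assignment is used because the grammar has no compound assignment operators.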
A Jax program is a collection of units that includes the units from the standard library. A unit is a class or an interface. Classes and interfaces are similar to their Java counterparts, with some notable limitations.
Interfaces contain method signatures only. Each signature is implicitly public and non-static.
Fields marked final cannot be modified at all; that is, there are no blank finals as in Java.
Classes cannot extend other classes; they can only implement interfaces. Classes may not be abstract: it is an error for a class that is declared to implement interfaces to fail to implement every method in those interfaces.
A declaration binds an identifier to an entity. There are five types of declarations:
Each occurrence of an identifier is either a defining occurrence or a using occurrence. Using occurrences are legal only in the identifier's scope. The scope is determined as follows:
The scope of an identifier declared in a class or interface declaration is the maximum possible scope.
The scope of an identifier declared as a member is the class body in which it was declared if undecorated, or equal to the scope of the class if marked public.
The scope of an identifier declared in a local variable declaration begins immediately after the declaration and extends to the end of its innermost enclosing block.
The scope of an identifier declared in a parameter declaration is the block of the method in which the parameter declaration appears.
The scope of an identifier declared as a for-statement index is (1) for the for statement with the three-part header, the two expressions in the for-statement specifier and the block of the for-statement, or (2) for the "for-in" statement, only its block.
The following rules restrict the choices for identifiers:
Jax features the following types:
The types int and double are called the arithmetic types; the type string, together with array types, class types and interface types, are called the reference types.
Blocks are used to control the scope of variable declarations. A block consists of zero or more statements.
A variable is something that stores a value. All variables have a type. The kinds of variables are:
i
Here i is a simple identifier, which must denote a local variable, parameter, for-index, or a field of the class in which this variable reference appears. The type of this variable is the type given to the field, parameter, local or for-index in its declaration. Local variables and parameters are always writeable, for-index variables are always read-only, and fields are read-only if and only if they are marked final in their declaration.
this
The variable expression this may only appear within a non-static method; it denotes a read-only variable whose value is a reference to the object on which the method was called.
m(a1, ..., an)
A read-only variable denoting the returned value from a method call. Here m must denote a method of the class in which the reference appears.
v[e]
Here v is a variable of an array type and e is an expression of type int. The type of this variable is v's base type. This variable is the array component at (zero-based) index e. The variable is not read-only.
v.f
Here v is a variable of a class type and f must be an identifier declared as a non-static field of v's type. This variable refers to the f-field of the object referred to by v. The type of this variable is the type associated with the field f. The variable is read-only if and only if the field is declared final.
v.m(a1, ..., an)
Here v is a variable of a class type and m must be a non-static method declared in the type of v. This is a read-only variable denoting the returned value from calling m with arguments a1 through an.
C::f
Here C is a class and f must be an identifier declared as a static field of C. This variable refers to the f-field of the class C. The type of this variable is the type associated with the field f. The variable is read-only if and only if the field is declared final.
C::m(a1, ..., an)
Here C is a class or interface and m must be a static method declared in C. This is a read-only variable denoting the returned value from calling m with arguments a1 through an.
A statement is code that is executed solely for its side effect; it produces no value. The kinds of statements are:
t i = e;
The variable declaration. e is evaluated and then a new local variable i of type t is declared and initialized with the value of e.
e;
The expression statement. e is evaluated and its value is ignored.
v = e;
The assignment statement. e must be type compatible with the type of v and v must be writable (in other words, not read-only). v is determined and e is evaluated, then the value of e is copied into v.
break;
The break statement. This statement may only appear within a while or for statement. The break terminates the execution of the innermost enclosing while or for statement.
return;
The return statement. Causes an immediate return from the enclosing method or constructor. If a method, the method must have been marked void in its declaration.
return e;
The return statement. Evaluates e then causes the innermost enclosing method to immediately return the value of e. The method must have a return type, and e must be type compatible with it.
if (e1) b1 else if (e2) b2 else if (e3) b3 ... else bn
The if statement. Each ei must have type boolean. The ei are evaluated in order from left to right until one of them is true or they have all been evaluated. If some ei evaluates to true, then the corresponding bi is executed, completing the execution of the if-statement. If none of the ei evaluates to true, bn is executed (if it exists).
while (e) b
The while-statement. e must have type boolean. First e is evaluated. If e is false the execution of the while statement terminates. If e is true, b is executed then the while-statement is executed again.
for (t i = e1; e2; e3) b
The for-statement. This is equivalent to {t i = e1; while (e2) {b e3;}}.
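The equivalence stated above can be checked on a small case. The following Java sketch (hypothetical names ForDesugarSketch, sumWithFor, sumWithWhile) writes the same loop once in the three-part for form and once in the while form, with the index declared in an enclosing block so that its scope matches.

```java
final class ForDesugarSketch {
    // The three-part form: for (t i = e1; e2; e3) b
    static int sumWithFor(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total = total + i;
        }
        return total;
    }

    // The while form: the index is declared in an outer block,
    // and e3 runs after the body on every iteration.
    static int sumWithWhile(int n) {
        int total = 0;
        {
            int i = 0;
            while (i < n) {
                { total = total + i; }
                i++;
            }
        }
        return total;
    }
}
```

Both methods return the sum of the integers from 0 to n-1, for instance 10 when n is 5.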
for (i in a) b
The array iteration statement. Here a must be an expression with an array type. This statement declares a new variable i whose scope is b. It executes b first with i set to a[0], then a[1] and so on for each value in a. The expression a is evaluated only once, at the beginning of the execution of the array iteration statement. The variable i may not be modified within b.
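A possible reading of these rules, sketched in Java: the array expression is stored in a temporary once, and the index variable is a fresh read-only variable on each iteration. The names ForInSketch, makeArray and evaluations are hypothetical; the counter exists only to make the single evaluation observable.

```java
final class ForInSketch {
    static int evaluations = 0;

    // Stands in for an arbitrary array-valued expression with a side effect.
    static int[] makeArray() {
        evaluations++;
        return new int[] {3, 1, 4};
    }

    // Models: for (i in makeArray()) { total = total + i; }
    static int sum() {
        int total = 0;
        int[] a = makeArray();               // evaluated once, before the first iteration
        for (int k = 0; k < a.length; k++) {
            final int i = a[k];              // read-only within the body
            total = total + i;
        }
        return total;
    }
}
```

Calling sum() once returns 8 and leaves evaluations at 1.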
synchronized (e) b
The synchronized statement. Here e must evaluate to an object. If e is unlocked, this statement locks e, executes b, then unlocks e. If e is already locked, execution blocks until e is unlocked. Threads queue on blocked objects in FIFO fashion.
Each expression has a type and a value. The value of an expression with a reference type is either null or a reference to an object. Arrays and objects are therefore never manipulated directly, but only through references.
An expression e is type-compatible with a type t if and only if
An expression of type int can appear anywhere an expression of type double is expected; in this case the integer value is implicitly converted to one of type double. The conversion must maintain the expression's value; this is always possible since the type double has 53 bits of precision.
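The precision argument can be illustrated in Java, under the assumption (not stated in this document) that a Jax int is a 32-bit two's complement integer like Java's. Every such value fits in the 53-bit significand of a double, whereas a 64-bit value just above 2^53 does not. The names below are hypothetical.

```java
final class WideningSketch {
    // A 32-bit int converted to double and back is unchanged.
    static boolean intSurvives(int value) {
        double widened = value;            // implicit int -> double conversion
        return (int) widened == value;
    }

    // For contrast: a 64-bit integer can lose low-order bits.
    static boolean longSurvives(long value) {
        double widened = (double) value;
        return (long) widened == value;
    }
}
```

Here intSurvives holds for every int, including the extreme values, while longSurvives fails for 2^53 + 1.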
The signature of a method or constructor refers to the number, type, and order of its parameters. For example, if a method or constructor f is declared
f(t1 p1, t2 p2, t3 p3)
then its signature is the type list (t1, t2, t3). Constructors and methods can be declared with the ellipsis '...'; for example,
f(t1 p1, t2 p2, t3 p3, ...)
has signature (t1, t2, t3, MORE). An expression list (e1, ..., en) is said to match a signature (t1, ..., tk) if (1) n=k and each ei is type-compatible (see Section 3.8) with ti, or (2) tk=MORE and k-1<=n and each ei, for i from 1 through k-1, is type-compatible with ti.
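The matching rule can be sketched as a small Java routine. Types are represented as plain strings, and the compatible method is a deliberately simplified stand-in for type compatibility (identical types, or int where double is expected); the real rule is richer. All names here are hypothetical.

```java
import java.util.List;

final class SignatureMatchSketch {
    static final String MORE = "MORE";

    // Simplified assumption: same type, or an int supplied where a double is expected.
    static boolean compatible(String argType, String paramType) {
        return argType.equals(paramType)
            || (argType.equals("int") && paramType.equals("double"));
    }

    // True if the argument types match the signature under rule (1) or rule (2).
    static boolean matches(List<String> argTypes, List<String> signature) {
        int n = argTypes.size();
        int k = signature.size();
        boolean variadic = k > 0 && signature.get(k - 1).equals(MORE);
        int fixed = variadic ? k - 1 : k;
        // Rule (1) needs n = k; rule (2) needs k-1 <= n.
        if (variadic ? n < fixed : n != fixed) {
            return false;
        }
        for (int i = 0; i < fixed; i++) {
            if (!compatible(argTypes.get(i), signature.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, the argument types (int, string, char) match the signature (int, MORE), while an empty argument list does not, because the one fixed parameter is missing.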
The Jax expressions are as follows. Note that the semantics given to operators here refer only to the built-in (non-overloaded) behavior of each operator.
An integer literal, which has type int.
A character literal, which has type char.
A floating point literal, which has type double.
A string literal, which has type string.
true
The literal of type boolean denoting truth.
false
The literal of type boolean denoting falsity.
null
A literal representing a reference to no object, and whose actual type depends on its context. Technically, every reference type t contains a value null_t.
v
Where v is a variable. The type of this expression is the type of the variable v, and the value of this expression is the current value stored in v.
v++
v must have type int. Produces the value of v, but increments v immediately after producing the value.
v--
v must have type int. Produces the value of v, but decrements v immediately after producing the value.
++v
v must have type int. Increments v, then produces this value.
--v
v must have type int. Decrements v, then produces this value.
new t (e1, e2, ..., en)
Here t names a class that has a constructor with a signature matched by the argument list e1 through en. Calls the constructor with the arguments copied to the parameters and produces a reference to the newly constructed object.
new t [e1][e2]...[en]
Produces a reference to a new array object of type t[][]...[] (an "n-dimensional array") with the specified number of components in each dimension. The values of the newly constructed object are undefined.
new t {e1, e2, ..., en}
Here t must be an array type. Produces a reference to a new array object with values e1 through en.
f(e1, ..., en)
f must name a method whose signature is matched by the argument list e1 through en. Each expression is evaluated in any order and the method f is called with the arguments copied to the parameters. If the method was marked with a type in its declaration, this expression produces the returned value from the method. Otherwise this expression produces no value. It is possible to think of such an expression producing a value of the pseudo type "void" which is not type-compatible with any type, not even itself.
(e)
Evaluates e and produces this value.
-e
e must have an arithmetic type. Evaluates e and produces the negation of e.
~e
e must have type int. Evaluates e and produces the bitwise complement of e.
!e
e must have type boolean. If e evaluates to true, the entire expression produces false, otherwise it produces true.
#e
e must be an array or a string. Produces the number of items if an array, or the number of characters if a string.
$e
e can be any expression whatsoever. Produces a string representation of its argument. For ints, chars, doubles and strings the produced string is identical to the output of printf with the %d, %c, %f and %s format specifiers, respectively. For booleans, the produced string is either "true" or "false". Arrays produce strings of the form [e1, e2, ..., en] where each ei is the result of applying $ to the corresponding element of the array. For objects of classes, the produced string is the result of calling the method public string toString() if such a method exists; if it does not, the produced string is the class name followed by "@" followed by some hexadecimal digits.
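The recursive rule for arrays can be sketched in Java. This models only the array and boolean cases faithfully; for other values it falls back on Java's String.valueOf, which is an assumption and will not reproduce printf's %f formatting for doubles. The names ToStringSketch and show are hypothetical.

```java
final class ToStringSketch {
    // Approximates $e: arrays become "[e1, e2, ..., en]" with $ applied to each element.
    static String show(Object value) {
        if (value instanceof Object[]) {
            Object[] items = (Object[]) value;
            StringBuilder sb = new StringBuilder("[");
            for (int i = 0; i < items.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(show(items[i]));   // apply the rule recursively
            }
            return sb.append("]").toString();
        }
        // Assumption: Java's default conversion stands in for the scalar cases.
        return String.valueOf(value);
    }
}
```

Applied to an array holding 1, true and a nested array holding "a", the sketch yields [1, true, [a]].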
e1 * e2
Both subexpressions must have arithmetic type. The subexpressions are evaluated in any order and their product is produced.
e1 / e2
Each subexpression must have an arithmetic type. Both expressions are evaluated, in any order, and the entire expression produces the quotient of e1 divided by e2. The type of the quotient is double only if either operand is double, otherwise the type is int.
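Java applies the same typing rule to '/', so it can serve as an illustration. One caveat: this document does not say how an int quotient is rounded, and the truncation shown here is Java's behavior, assumed rather than specified for Jax. The names are hypothetical.

```java
final class DivisionSketch {
    // Both operands are int, so the quotient is an int.
    static int intQuotient(int a, int b) {
        return a / b;
    }

    // One operand is double, so the int operand is converted and the quotient is a double.
    static double mixedQuotient(int a, double b) {
        return a / b;
    }
}
```

In Java, intQuotient(7, 2) is 3 while mixedQuotient(7, 2.0) is 3.5.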
e1 % e2
Each subexpression must have type int. Both expressions are evaluated, in any order, and the entire expression produces an int which is the modulo of e1 and e2.
e1 + e2
Either both subexpressions must have arithmetic type, or both must have string type. In the former case, the subexpressions are evaluated in any order and their sum is produced. In the latter, the subexpressions are evaluated in any order and the (left-to-right) concatenation of the strings is produced.
e1 - e2
Each ei must have an arithmetic type. Evaluates the subexpressions in any order, then produces the difference of e1 and e2.
e1 << e2
Each ei must have type int. Produces the value of e1 shifted left e2 positions.
e1 >> e2
Each ei must have type int. Produces the value of e1 arithmetically shifted right e2 positions.
e1 <= e2
Both subexpressions must have arithmetic type, or both must be chars. Both expressions are evaluated, in any order, and the entire expression produces whether the value of e1 is less than or equal to the value of e2.
e1 < e2
Both subexpressions must have arithmetic type, or both must be chars. Both expressions are evaluated, in any order, and the entire expression produces whether the value of e1 is less than the value of e2.
e1 == e2
e1 must be type-compatible with the type of e2 or e2 must be type compatible with the type of e1. The subexpressions are evaluated in any order, and the entire expression produces whether these values are the same, taking into account any automatic conversions.
e1 != e2
Equivalent to !(e1==e2).
e1 > e2
Both subexpressions must have arithmetic type, or both must be chars. Both expressions are evaluated, in any order, and the entire expression produces whether the value of e1 is greater than the value of e2.
e1 >= e2
Both subexpressions must have arithmetic type, or both must be chars. Both expressions are evaluated, in any order, and the entire expression produces whether the value of e1 is greater than or equal to the value of e2.
e1 & e2
Each subexpression must have type int. Both expressions are evaluated, in any order, and the entire expression produces an int which is the bitwise and of e1 and e2.
e1 ^ e2
Each subexpression must have type int. Both expressions are evaluated, in any order, and the entire expression produces an int which is the bitwise exclusive or of e1 and e2.
e1 | e2
Each subexpression must have type int. Both expressions are evaluated, in any order, and the entire expression produces an int which is the bitwise inclusive or of e1 and e2.
e1 && e2
Each subexpression must have type boolean. First e1 is evaluated. If it evaluates to false, the entire expression immediately produces false (without evaluating e2). Otherwise e2 is evaluated and the entire expression produces the value of e2.
e1 || e2
Each subexpression must have type boolean. First e1 is evaluated. If it evaluates to true, the entire expression immediately produces true (without evaluating e2). Otherwise e2 is evaluated and the entire expression produces the value of e2.
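The short-circuit behavior of && and || is the same in Java, which makes it easy to observe. In the sketch below (hypothetical names), probe counts how many operands are actually evaluated.

```java
final class ShortCircuitSketch {
    static int calls = 0;

    // Records that an operand was evaluated, then yields the given value.
    static boolean probe(boolean result) {
        calls++;
        return result;
    }

    // The right operand is skipped because the left one is false.
    static boolean andDemo() {
        return probe(false) && probe(true);
    }

    // The right operand is skipped because the left one is true.
    static boolean orDemo() {
        return probe(true) || probe(false);
    }
}
```

Each demo method evaluates exactly one operand, so calling both in turn raises calls by two, not four.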
e1 ? e2 : e3
Here e1 must have type boolean, and e2 and e3 must be of the same type. First e1 is evaluated. If it evaluates to true, the entire expression evaluates and produces e2, otherwise it evaluates and produces e3.
The following classes are assumed to exist in the runtime environment of every Jax program.
class Text {
    public static int codepoint(char c) {...}
    public static char character(int i) {...}
    public static int indexOf(string s, char c) {...}
    public static char charAt(string s, int i) {...}
    public static string substring(string s, int start, int length) {...}
    public static int parseInt(String s) {...}
    public static double parseDouble(String s) {...}
    public static boolean useLocalizationsFrom(Stream s) {...}
    public static string format(String s, ...) {...}
}
Text.format().
class Math {
    public static final double PI = ...;
    public static double sqrt(double x) {...}
    public static double sin(double x) {...}
    public static double cos(double x) {...}
    public static double atan(double x, double y) {...}
    public static double ln(double x) {...}
}
class Io {
    public static int printf(String format, ...) {...}
    public static final Stream STDIN = ...;
    public static final Stream STDOUT = ...;
}
Io.STDOUT.printf().
class Stream {
    public Stream forFile(String filename, String mode) {...}
    public Stream forSocket(Socket socket, String mode) {...}
    int read() {...}
    char readChar() {...}
    public ByteArray read(int count) {...}
    public string readLine() {...}
    public int write(ByteArray bytes) {...}
    public void write(String s) {...}
    public int printf(String format, ...) {...}
    public boolean close() {...}
}
fprintf in C.
interface Runnable {
    void run();
}
class Thread {
    public static Thread start(Runnable r) {...}
    public static Thread currentThread() {...}
    public static void sleep(int millis) {...}
    public void interrupt() {...}
    public boolean isInterrupted() {...}
}
r's run() method. Returns a reference to this new thread.
true.
class Socket {
    public static Socket createListener(int port) {...}
    public Socket accept() {...}
    public static Socket createClientFor(int inetAddress, int port) {...}
    public void close() {...}
}
Unlike Java, Jax does not require a dynamic runtime system with class loaders and the potential for a NoSuchMethodError to be thrown. Instead, Jax code is intended to be compiled and linked into a standard executable file, which a host operating system can run in the usual way.
At link time, a class containing a public static void method
called main, taking a single argument of type String[]
must be specified; this specifies the entry point of the executable
program. The operating system passes command line arguments
to this method.
The Jax language definition does not specify the manner in which source code (a sequence of characters) is presented to a compiler. All that is required is that the compiler see some set of classes and interfaces. An implementation may require all units to appear in a single source file; other implementations may allow separate compilation of classes and interfaces. Compilers that accept separately compiled units must specify some mechanism to handle units not defined in the file currently being compiled; for example, if a compiler encounters a reference to a class C not in the current source file, it might look for a file called C.jax and compile that, returning to the original file when C has been compiled. Some safeguard must be built into this mechanism to prevent circular references from causing the compiler to enter an infinite loop. A compiler may employ a more sophisticated approach, looking not only for C.jax but also for C.o (or C.obj). If the date on the object file is newer than that of the source file, a compiler might assume the foreign class is already compiled, and extract interface information from the object file. Mechanisms for doing this are compiler-specific and not part of the Jax language specification.
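One simple safeguard against circular references is to record each unit as soon as its compilation starts and to skip any unit already recorded. The Java sketch below is only one possible mechanism, not something this specification prescribes; the class UnitResolverSketch and the in-memory reference map (standing in for scanning files such as C.jax) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class UnitResolverSketch {
    private final Map<String, List<String>> references;   // unit name -> units it refers to
    private final Set<String> started = new HashSet<>();  // compiled or currently in progress
    private final List<String> compiledOrder = new ArrayList<>();

    UnitResolverSketch(Map<String, List<String>> references) {
        this.references = references;
    }

    // Compiles a unit after the units it refers to, visiting each unit at most once.
    void compile(String unit) {
        if (!started.add(unit)) {
            return;   // already seen: this is what breaks a reference cycle
        }
        for (String dependency : references.getOrDefault(unit, Collections.<String>emptyList())) {
            compile(dependency);
        }
        compiledOrder.add(unit);
    }

    List<String> order() {
        return compiledOrder;
    }
}
```

If A refers to B and B refers back to A, calling compile("A") terminates and finishes B before A. A real compiler would still need a way to use A's interface while compiling B, which this sketch does not model.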