ML was, long ago, a small programming language. Today, ML is the name for a family of languages that include Standard ML (a.k.a SML), Objective CAML (a.k.a OCaml), F#, LazyML, Alice, and Elm.
The original ML and its immediate descendants were never really widely used, but they have been enormously influential. Many popular languages have adopted several of ML’s amazing innovations.
You’ll find some good stuff here:
Standard ML:
More reasons: Read this overview of features.
The traditional “Hello, world” program:
print "Hello, World\n";
There are many implementations of Standard ML, including Moscow ML, Standard ML of New Jersey, MLton, Poly/ML, SML.NET, MLJ and ML Kit. These notes only cover Moscow ML.
You can run the code by dropping it into an interactive shell, or by using a compiler to produce executables.
Just type your declarations and expressions into the interactive shell and the system will respond. Here is an example session (assuming the file hello.sml
contains the code above):
$ mosml Moscow ML version 2.10 Enter `quit();' to quit. - 4; > val it = 4 : int - "hello " ^ " there"; > val it = "hello there" : string - fun fact n = if n = 0 then 1 else n * fact(n - 1); > val fact = fn : int -> int - fact 5; > val it = 120 : int - val x = 12 and y = ~5; > val x = 12 : int val y = ~5 : int - x - y * 2; > val it = 22 : int - use "hello.sml"; [opening file "hello.sml"] Hello, World > val it = () : unit [closing file "hello.sml"] > val it = () : unit - length [4, 3, 8, 7, 3]; > val it = 5 : int - quit();
Note how nice the system is.... If you don’t specify types, it figures them out for you (as long as what you wrote makes sense). If you type in an expression directly, it turns it into a declaration binding the result to the special name it
.
The Moscow ML compiler is called mosmlc
. It won’t compile our “Hello, World” program because it doesn’t like standalone expressions: it requires declarations:
val _ = print "Hello, World\n";
Then on Windows you can enter:
> mosmlc helloworld.sml -o helloworld.exe
and on Unix you can enter:
$ mosmlc helloworld.sml -o helloworldand you have your executable. To compile and run, of course, that would be:
> mosmlc helloworld.sml -o helloworld.exe && helloworldor:
$ mosmlc helloworld.sml && ./a.out
Here’s a slightly more interesting program, illustrating a bit more of the language:
(* * A little program that writes the command line arguments from last * to first, to standard output. It is longer than necessary because * it is a teaching example. *) (* Returns the reversal of a list. Warning: wheel reinvention. *) fun reverse [] = [] | reverse (h :: t) = reverse t @ [h]; fun concat_space s = s ^ " "; (* Prints each command line arg, suffixed with a space. *) val _ = let val args = CommandLine.arguments() in map (print o concat_space) (reverse args); print "\n" end;
A program is a sequence of declarations. Declarations begin with:
val
, to declare a new value binding
fun
, to declare a new function
type
, to introduce a type abbreviation
datatype
, to define a new type with visible constructors
abstype
, to define a new type with hidden constructors
exception
, to define a new exception constructor
signature
, to define a new signature
structure
, to define a new structure
functor
, to define a new function
infix
, to give infix, left-associative status to an operator
infixr
, to give infix, right-associative status to an operator
nonfix
, to declare an operator to be prefix
local
, to make a declaration that uses private local declarations
open
, to allow all items in a structure to be used without qualification
Most program entities (types, exceptions, values, etc.) are defined in structures. For example, the structure Timer
contains the type cpu_timer
and the functions startCPUTimer
and checkCPUTimer
, among other things, so you need to write Timer.cpu_timer
and Timer.checkCPUTimer
, etc. But entities in the package General
(which contains types like int
, string
, and real
and functions like <=
, not
, and mod
do not have to be qualified with the package name.
(* * Trivial example of timing. *) fun count(n) = if (n > 0) then count(n - 1) else (); val _ = let val t = Timer.startCPUTimer() in count(10000000); print (Time.toString(#usr(Timer.checkCPUTimer(t))) ^ "\n") end;
abstype and andalso as case do datatype else end eqtype exception fn fun functor handle if in include infix infixr let local nonfix of op open orelse raise rec sharing sig signature struct structure then type val where with withtype while ( ) [ ] { } , : :> ; ... _ | = => -> #
Identifiers are either:
'
) and underbars (_
) starting with a letter
or prime;
! % & $ # + - / : < = > ? @ \ ~ ' ^ | *
But reserved words are excluded. And notice no mixing of
alphanumeric and symbolic characters! You can’t have the identifier
|-o-|
, though you can have |-*-|
.
Between (*
and *)
and they can nest!
Here are some of the types:
Name | Examples | Notes |
---|---|---|
unit
| ()
| This is the only value of this type |
bool
| true
| These are the only two values of the type |
int
| 0
| Unary negation is a tilde, so as not to be confused with - , which denotes binary subtraction
|
word
| 0w0
| |
real
| 0.7
| Note the tilde for negation |
char
| #"s"
| |
string
| "hello"
| |
α -> β
| fn x => x + 1
| The type of functions taking an argument of type α and returning a value of type β |
record types | {name="Masha", age=25}
| A built-in type former |
α * β
| (4, "d", false)
| Abbreviation for records whose fields are numbers starting with 1 |
α list
| nil
| |
α option
| SOME 2
| |
exn
| Div
| This is a very special datatype because new constructors can be added at runtime |
α ref
| ref 2
| The gateway into mutable state |
α frag
| QUOTE "hello"
| Powerful stuff |
Standard ML, like all the MLs takes type inference to a whole new level, compared to languages like C#, Go, Rust, and Swift. Let’s learn by example. Watch how SML infers these types, and pay attention to the introduction of type variables (you’ll know them when you see them):
[true, true, false] (* bool list *) [] (* 'a list *) fn s => length s * 2 (* 'a list -> int *) fn x => (15.2, x) (* 'a -> real * 'a *) {x = 3, y = 5} (* {x : int, y : int} *) fn x => fn y => fn z => (z, x, y) (* 'a -> 'b -> 'c -> 'c * 'a * 'b *) fn (x, y) => x = y; (* ''a * ''a -> bool *) (fn s => SOME s)[] (* 'a list option *)
Types with a double apostrophe prefix are equality types. Equality doesn’t work on all types (what does it mean for two functions to be equal, right?). For example, the type int->int
will match (unify) with 'a
, but it will not unify with ''a
:
$ mosml Moscow ML version 2.10 Enter `quit();' to quit. - fun addZero x = (x, 0); > val 'a addZero = fn : 'a -> 'a * int - addZero(fn n => n * 3); > val it = (fn, 0) : (int -> int) * int - fun equal(x, y) = x = y; > val ''a equal = fn : ''a * ''a -> bool - equal(fn n => n * 3, fn n => n * 3); ! Toplevel input: ! equal(fn n => n * 3, fn n => n * 3) ! ^^^^^^^^^^ ! Type clash: match rule of type ! 'a -> 'b ! cannot have equality type ''c
Interested in the details of the SML type system? Read about the Hindley-Milner Type System. It’s pretty cool.
The type
declaration does not make a new type, just a type abbreviation:
type intpair = int * int; type person = {name: string, birthday: date, weight: real}; type simplefun = real -> real; type text = string; type intset = bool list; type coin_flipping_results = bool list;
Here coin_flipping_results
and intset
are exactly the same type.
You need to use datatype
or abstype
to create a new type, distinct from any other. Datatypes are defined with constructors:
datatype weekday = Monday | Tuesday | Wednesday | Thursday | Friday; datatype file_descriptor = Stdin | Stdout | Stderr | Other of int; datatype 'a tree = Empty | Node of 'a * 'a tree * 'a tree; datatype Shape = Circle of {radius: real} | Rectangle of {height: real, width: real} | Polygon of (real * real) list; abstype 'a stack = Data of 'a list with exception Underflow; val newStack = Data [] fun push (Data s) x = Data (x::s) fun pop (Data []) = raise Underflow | pop (Data (_::t)) = Data t fun top (Data []) = raise Underflow | top (Data (x::t)) = x fun listToStack s = Data s fun stackToList (Data s) = s end;
The abstract type is just a datatype that hides its constructors. But its probably better to use structures anyway.
Functions are just regular values who happen to have a type like 'a -> 'b
. Functions must have exactly one parameter and exactly one return type. The syntax of a function is:
fn pattern => exp ('|' pattern => exp)*
Function calls are straighforward:
(fn x => x + 6) 20; (fn [] => false | (h::t) => true) [4, 5, ~3]; (fn _ => 0) (4, "dog", [4.5, 1.1], (1,1)); (fn 5 => 0 | x => x) 5; (fn 5 => 0 | x => x) 12;
Note that each match is tried in order. Order matters, so be careful.
Because functions are values, you declare them just like any other value. You can take advantage of the fact that each declaration can use declarations that precede it:
val two = 2; val addSix = fn x => x + two + 4; val square = fn x:real => x * x; val magnitude = fn (x,y) => Math.sqrt(square x + square y); val rec fact = fn n => if n = 0 then 1 else n * fact (n-1);
You can use fun
for convenience (it’s basically val rec
):
fun addSix x = x + 6; fun square (x:real) = x * x; fun magnitude (x,y) = Math.sqrt(square x + square y); fun fact n = if n = 0 then 1 else n * fact (n-1);
You can use multiple patterns with fun
too; it sure makes writing recursive functions easy. And pattern-based definitions are much preferred over if-expressions.
fun fact 0 = n | fact n = n * fact(n - 1); fun pow (0, _) = 1.0 | pow (n, x) = x * pow(n - 1, x); fun size Empty = 0 | size (Node (x, left, right)) = 1 + size left + size right; fun preorder Empty = [] | preorder (Node (x, left, right)) = [x] @ preorder left @ preorder right;
The functions fn (x,y) => x + y
and fn x => fn y => x + y
are similar. The former is the uncurried version; the latter is curried. Curried functions allow “partial application”:
fun add x y = x + y; fun addSix = add 6; add 6 5; (* returns 11 *) addSix 5; (* also returns 11 *)
A higher order function is one that takes a function as a parameter and/or returns a function as its result. The curried plus function above is a higher order function, because it takes in an integer and returns a function.
Operators are really just functions that are given precedence and fixity in a infix
, infixr
or nonfix
directive. These are some of the infix operators defined in the standard library. (Note that many of the
built-in operators are overloaded, so the specification has “fake types” — wordint
means either int
, word
or word8
; num
means wordint
or real
; numtxt
means num
, char
or string
.)
Precedence | Id | Type | Description | Associativity |
---|---|---|---|---|
7 | * | num * num -> num | product | L |
/ | real * real -> real | floating-point quotient | L | |
div | wordint * wordint -> wordint | quotient (rounded toward -infinity) | L | |
mod | wordint * wordint -> wordint | remainder (of div) | L | |
6 | + | num * num -> num | sum | L |
- | num * num -> num | difference | L | |
^ | string * string -> string | concatenation | L | |
5 | :: | 'a * 'a list -> 'a list | push to head of list | R |
@ | 'a list * 'a list -> 'a list | append lists | R | |
4 | = | 'a * 'a -> bool | equal to | L |
<> | 'a * 'a -> bool | not equal to | L | |
< | numtxt * numtxt -> bool | less than | L | |
<= | numtxt * numtxt -> bool | less than or equal to | L | |
> | numtxt * numtxt -> bool | greater than | L | |
>= | numtxt * numtxt -> bool | greater than or equal to | L | |
3 | := | 'a ref * 'a -> unit | assignment | L |
o | ('b*'c) * ('a*'b) -> ('a*'c) | composition | L | |
0 | before | 'a * 'b -> 'a | return first argument | L |
And you can define your own:
fun |-*-| (x, y) = 2 * x + 3 * y; infix 5 |-*-|; 10 |-*-| 4; (* returns 32 *) 1 |-*-| 2 |-*-| 3; (* returns 25 *) 1 |-*-| (2 |-*-| 3); (* returns 41 *) 5 + 3 |-*-| 6 before 8; (* returns 34 *)
You can abuse the built-in operators:
- infix 2 *; > infix 2 * - 4 + 8 * 1 + 4; > val it = 60 : int - val op+ = op-; > val + = fn : int * int -> int - 5 + 2; > val it = 3 : int
Remember, useful programming languages provide modules for:
ML’s module system is pretty sophisticated. The basic element is the structure:
structure Set = struct datatype 'a set = Items of 'a list; val empty = Items []; fun add (Items xs) x = if List.exists (fn y => y = x) xs then Items xs else Items(x::xs); fun remove (Items xs) x = Items (List.filter (fn y => y <> x) xs); fun cardinality (Items xs) = length xs; fun contains (Items xs, x) = List.exists (fn y => y = x) xs; end;
The contents are accessible with dot-notation (no surprise, probably):
val s = Set.add Set.empty 7; val s = Set.add s 8; val s = Set.remove s 2; Set.cardinality s; Set.contains (s, 7); Set.Items [8,8,4]; (* ICK!! That should probably have been hidden *)
The previous structure exposes too much information. We can use a signature to restrict the view. We should write:
signature SET = sig type 'a set; val empty: 'a set; val add: ''a set -> ''a -> ''a set; val remove: ''a set -> ''a -> ''a set; val cardinality: 'a set -> int; val contains: ''a set * ''a -> bool; end;
Then we say:
structure Set :> SET = struct datatype 'a set = Items of 'a list; val empty = Items []; fun add (Items xs) x = if List.exists (fn y => y = x) xs then Items xs else Items(x::xs); fun remove (Items xs) x = Items (List.filter (fn y => y <> x) xs); fun cardinality (Items xs) = length xs; fun contains (Items xs, x) = List.exists (fn y => y = x) xs; end;
Now Set.Items
doesn’t exist! We can now even say:
signature READ_ONLY_SET = sig type 'a set; val empty: 'a set; val cardinality: 'a set -> int; val contains: ''a set * ''a -> bool; end; structure ReadOnlySet :> READ_ONLY_SET = Set;
now ReadOnlySet.empty
exists, but
ReadOnlySet.remove
does not.
If we change the set representation from a list to a binary search tree, we would have to specify how the set elements were to be ordered. This less-than relation is an input to the structure. The way to parameterize structures in SML is with functors:
functor MakeSet (eqtype t; val lt: t * t -> bool): sig type ''a set val empty: t set val add: t set -> t -> t set val remove: t set -> t -> t set val cardinality: t set -> int val contains: t set * t -> bool end = struct datatype ''a set = Empty | Node of ''a * ''a set * ''a set val empty = Empty; fun add Empty x = Node (x, Empty, Empty) | add (s as Node(y, left, right)) x = if (x = y) then s else if lt(x, y) then Node(y, add left x, right) else Node(y, left, add right x); fun remove s x = raise (Fail "remove not implemented"); fun cardinality Empty = 0 | cardinality (Node(_, left, right)) = 1 + cardinality left + cardinality right; fun contains (Empty, x) = false | contains (Node(y, left, right), x) = if lt(x, y) then contains(left, x) else if lt(y,x) then contains(right, x) else true; end;
Then:
structure IntSet = MakeSet(type t = int; val lt = op<); val s = IntSet.add IntSet.empty 8; IntSet.contains(s, 7); ...
The following structures comprise the Standard Basis Library
Structure | Description |
---|---|
Array | mutable constant-time-access arrays |
Array2 | two-dimensional arrays |
Binaryset | binary tree implementation of finite sets |
Bool | Booleans |
Byte | character-byte conversion |
Char | characters |
CharArray | arrays of characters |
CharVector | vectors of characters (= strings) |
CommandLine | program name and arguments |
Date | manipulation of calendar dates |
FileSys | interaction with the file system |
General | various top-level primitives |
Int | operations on integers |
List | classic list manipulation functions |
ListPair | operations on pairs of lists |
Math | trigonometric functions etc. |
OS | operating system information |
Option | partial functions |
Path | file-system independent path manipulation |
Process | manipulating processes |
Real | arithmetics on floating-point numbers |
Signal | Unix signals |
SML90 | top-level compatibility with SML’90 |
String | string manipulation |
StringCvt | conversion to and from strings |
Substring | manipulation of constant-time substrings |
TextIO | text input-output streams (imperative) |
Time | time points and durations |
Timer | measuring real time and cpu time |
Unix | starting concurrent subprocesses under Unix |
Vector | immutable constant-time-access vectors |
Word | words (31-bit unsigned integers) |
Word8 | bytes (8-bit unsigned integers) |
Word8Array | arrays of bytes |
Word8Vector | vectors of bytes |
Different installations will add many more structures. For example Moscow ML adds Arraysort, Binarymap, BinIO, Dynarray, Dynlib, Intset, Meta, Mosmlcgi, Mysql, Regex, Signal, Splaymap, Gdimage, among others.
The Moscow ML Library Reference is online.