Rust

Memory safety without garbage collection! Concurrency without Data Races! Abstraction without Overhead!

Why Rust?

Rust was designed to give the programmer as much direct control as C++, allowing:

...but without all the bugs that plague C++ programmers. Rust is safe! There are:

Rust compiles to the metal and doesn't need a runtime nor a garbage collector, so:

...although:

But if you do need unsafe features, they are available:

Impressive, right? A lot of people think so. From the Stack Overflow 2022 Developer Survey:

[Image: mostlovedlangs.png]

Rust wins the “most loved” category, by quite a bit.

Here’s Jeff’s introduction:

Getting Started

Rust is a systems programming language, so you compile source files into machine executables. Say hello:

hello.rs
fn main() {
    println!("hello");
}

You can run this online at The Rust Playground, TIO, or Replit, or you can install Rust yourself and run it on the command line:

$ rustc hello.rs && ./hello
hello

Like C, C++, and Go, you need that darn main.

Here is a program with loops and conditionals:

triple.rs
fn main() {
    for c in 1..=40 {
        for b in 1..c {
            for a in 1..b {
                if a * a + b * b == c * c {
                    println!("({a}, {b}, {c})");
                }
            }
        }
    }
}
$ rustc triple.rs && ./triple
(3, 4, 5)
(6, 8, 10)
(5, 12, 13)
(9, 12, 15)
(8, 15, 17)
(12, 16, 20)
(15, 20, 25)
(7, 24, 25)
(10, 24, 26)
(20, 21, 29)
(18, 24, 30)
(16, 30, 34)
(21, 28, 35)
(12, 35, 37)
(15, 36, 39)
(24, 32, 40)

Here is Phil Dorin's 180° clock hands problem:

clockhands.rs
fn main() {
    for i in 0..11 {
        let t = (((i as f64) + 0.5) * 43200.0 / 11.0) as i32;
        let (hours, remaining_seconds) = (t / 3600, t % 3600);
        let (minutes, seconds) = (remaining_seconds / 60, remaining_seconds % 60);
        println!("{:02}:{minutes:02}:{seconds:02}", if hours == 0 {12} else {hours});
    }
}
$ rustc clockhands.rs && ./clockhands
12:32:43
01:38:10
02:43:38
03:49:05
04:54:32
06:00:00
07:05:27
08:10:54
09:16:21
10:21:49
11:27:16

Note that Rust writes its conditional expressions not with ? and : but with English keywords: if hours == 0 {12} else {hours}.

Our last introductory example features command line arguments, vectors, result objects, and a few other things. Don’t worry about what this program does right now. Just get a feel for what Rust “looks like.”

fib.rs
fn main() {
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 2 {
        println!("Exactly one command line argument required");
        return;
    }
    let Ok(n) = args[1].parse() else {
        println!("Must be positive integer");
        return;
    };
    let (mut a, mut b) = (0, 1);
    while b <= n {
        print!("{b} ");
        (a, b) = (b, a + b);
    }
    println!("");
}
$ rustc fib.rs && ./fib
Exactly one command line argument required
$ ./fib 20000
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711
$ ./fib dog
Must be positive integer

Learning Rust

Now that we have seen what Rust programs look like, it’s time to learn (a lot of) the language. Where can we do this? Here are some of the important reads:

Getting advanced? Try:

Let’s take our own language tour now.

Variables

Introduce new variables with let. If a variable is not used, the compiler issues a warning (not an error), unless the variable’s name starts with an underscore.

fn main() {
    let x = 1;
    println!("{x}");    // used it
    let y = 2;          // WARNING: unused variable: `y`
    let _z = 3;         // Ok, don't *have* to use this one
}

Subsequent declarations of variables with the same name shadow previous ones.

fn main() { 
    let x = 1;          // introduces a variable
    println!("{x}");    // prints 1
    let x = "sweet";    // completely different variable, shadows previous x
    println!("{x}");    // prints sweet
}
Exercise: Try this example, and as many of these on this page as you can, in the Rust Playground.

Variables are immutable by default. Use mut to make the variable mutable.

fn main() {
    let x = 1;
    x = 2;                // ERROR: cannot assign twice to immutable variable `x`
    println!("{x}");      // prints 1
    let mut y = 3;        // mut makes it mutable
    println!("{y}");      // prints 3
    y = 5;                // ok to mutate
    println!("{y}");      // prints 5
}

Blocks create new scopes. Variables introduced in blocks are in scope only from their declaration to the end of the block.

fn main() { 
    let x = 1;
    {                       // a new block
        println!("{x}");    // prints 1 (value of x from outside)
        let x = 2;          // a completely new var, shadows outer x
        println!("{x}");    // prints 2
    }
    println!("{x}");        // prints 1 (we are back outside)
}

Rust always constrains the types of values allowed for a variable. Normally the constraint is inferred from the values assigned to it (including an initializer, if present). The constraints are checked at compile-time.

fn main() {
    let mut x = 2;      // x constrained to an integer type
    println!("{x}");
    x = 5;              // assignment is ok!
    println!("{x}");
    x = 3.0;            // mismatched types: expected integer found floating-point number
}

If there is no initializer, you can’t use the variable until you’ve assigned to it. Then it gets its type constraint.

fn main() {
    let x;              // Perfectly ok
    println!("{x}");    // ERROR: binding `x` is possibly-uninitialized
    x = 5;              // Now initialized, and constrained to integer type
    println!("{x}");    // Happily prints 5
}

You can explicitly supply a constraint even if an initializer is present.

fn main() {
    let a: i32 = 5;
    let b: i8 = 3;
    let c: char = '😀';
    let d: [i32; 3] = [10, 20, -30];
    let e: bool = false;
    let f: f64 = 3.8E-5;
    let g: String = String::from("hello");
    println!("{a} {b} {c} {d:?} {e} {f} {g}");
    // prints 5 3 😀 [10, 20, -30] false 0.000038 hello
}

The technique of explicitly specifying a type constraint is useful when the type cannot be inferred. If you prefer, you can help the inferencer in another way, using the turbofish:

fn main() {
    let unique: Vec<char> = "hello".chars().collect();
    let list = "hello".chars().collect::<Vec<char>>();
    println!("{}", unique == list); // true

    let two_pi: f64 = "6.283185307179586".parse().unwrap();
    let tau = "6.283185307179586".parse::<f64>().unwrap();
    println!("{}", two_pi == tau); // true
}

The let statement introduces variables via patterns. The pattern is what goes on the left hand side of the statement. The right hand side has the initializer expression. Examples:

fn main() { 
    struct Point { x: f64, y: f64 }
    
    let p = Point { x: 3.3, y: -1.0 };      // identifier pattern
    let (a, b) = (3.5, "interesting");      // tuple pattern
    let [c, d] = [false, true];             // array pattern
    let Point {x: e, y: f} = p;             // struct pattern
    println!("{a} {b} {c} {d} {e} {f}");
    // prints 3.5 interesting false true 3.3 -1

    let Point {x, y} = p;       // x is short for x:x
    let [z, ..] = [1, 2, 3];    // .. stands in for "all remaining"
    println!("{x} {y} {z}");    // prints 3.3 -1 1
}

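A pattern that might not match is called refutable. Use “let-else” to handle a refutable pattern in a let statement: the else block runs when the pattern does not match, and it must diverge (for example, with return).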

fn main() { 
    let mut v = vec![1, 2, 3];
    let Some(t) = v.pop() else {
        return;
    };
    println!("{t}");    // prints 3
}
Have you noticed something?

We haven’t thrown any exceptions yet, nor can you in Rust. Rust prefers errors as values. You will see a lot of optionals and results. Get used to doing things this way. It’s modern. There’s no hidden control flow. You’ll like it!
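To make “errors as values” a little more concrete, here is a minimal sketch. The helper names parse_age and first_even are invented for illustration; Option, Result, str::parse, and the ? operator are standard Rust.

```rust
use std::num::ParseIntError;

// Hypothetical helper: returns the parsed number, or the parse error as a value.
fn parse_age(text: &str) -> Result<u8, ParseIntError> {
    let age: u8 = text.trim().parse()?;     // ? returns early with the Err value
    Ok(age)
}

// Hypothetical helper: there may be no even number, so the result is an Option.
fn first_even(numbers: &[i32]) -> Option<i32> {
    numbers.iter().copied().find(|n| n % 2 == 0)
}

fn main() {
    match parse_age("42") {
        Ok(age) => println!("age is {age}"),        // prints age is 42
        Err(e) => println!("bad age: {e}"),
    }
    if parse_age("dog").is_err() {
        println!("dog is not an age");              // this line is printed
    }
    println!("{:?}", first_even(&[3, 5, 8, 9]));    // prints Some(8)
    println!("{:?}", first_even(&[3, 5]));          // prints None
}
```

The caller has to look at the returned value to find out what happened, so the failure path is visible in the code.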

There is more about patterns below.

Statements

There are basically three kinds of statements. The first two are essentially declarations and the rest are...well, expressions. Rust is often called an expression-oriented language.

Kind | Purpose | Notes
Let Statements | Introduces one or more new variables | (Many examples above)
Item Declarations | Declares an item | Items can be: modules, extern crates, use declarations, functions, type aliases, structs, enums, unions, constants, statics, traits, implementations, extern blocks, macros
Expression Statements | Evaluates an expression and “throws away” the result, evaluating the expression only for its side effects | Expression statements can be: literals, paths, operators, groups, arrays, awaits, indexings, tuples, tuple indexings, structs, calls, method calls, field accesses, closures, async blocks, continues, breaks, ranges, returns, underscores, macros, blocks, unsafe blocks, infinite loops, predicate loops (whiles), predicate pattern loops (while-let), iterator loops (for), ifs, if-lets, matches, labeled expressions

It’s rather interesting that expressions are statements, and many things you normally think of as statements in other languages (if, while, match, etc.) are actually expressions.

fn main() {
    // A plain loop expression produces ! if it runs forever,
    // or the value of the break expression if it breaks out
    let mut count = 0;
    let result = loop {
        count += 1;
        if count == 10 {
            break count * 2;
        }
    };
    println!("{result}");               // prints 20

    // While (and for) expressions produce ()
    let mut count = 1;
    let result = while count < 10 { count += 1 };
    println!("{result:?}");             // prints ()

    // Match expressions produce what they produce
    let result = match count {
        0 | 1 => "binary-ish",
        2 => "couple",
        3..=8 => "a few",
        _ => "lots"
    };
    println!("{result}");               // prints "lots"
}

There is some power in this idea. Blocks are sequences of statements, but the blocks themselves are expressions! The value of a block is the value of its final expression. But be careful with semicolons: a semicolon after the final expression turns it into a statement, and the block then produces the unit value (). So leave off the final semicolon if you want the block expression to have a value.

fn main() {
    let eight = {
        let x = 3;
        x + 5                     // No final semicolon, produces 8
    };
    let unit = {
        let x = 3;
        x + 5;                    // Final semicolon, produces ()
    };
    println!("{eight}");          // prints 8
    println!("{unit:?}");         // prints () 
}

This works well for function bodies, which are...blocks. So you don’t often see the keyword return, if at all:

fn fib(n: i64) -> i64 {
    let (mut a, mut b) = (0, 1);
    for _ in 0..n {
        (a, b) = (b, a + b);
    }
    a
}

fn main() {
  println!("{}", fib(50));       // prints 12586269025
}

Types

What types are available to you? A lot! Here are the built-in ones. All of them except str have values of a fixed size, so we know exactly how much space each of their values needs at compile time. This is necessary to make programs run as fast as possible:

Type | Examples | Notes
i8 | 24i8 | -128..=127
i16 | -30000i16 | -32768..=32767
i32 | 99310855i32 | -2147483648..=2147483647
i64 | -89i64 | -9223372036854775808..=9223372036854775807
isize | 3500022isize | Signed integer with size of a pointer on the host architecture
u8 | 255u8 | 0..=255
u16 | 4095u16 | 0..=65535
u32 | 999999999u32 | 0..=4294967295
u64 | 2u64 | 0..=18446744073709551615
usize | 89553421138usize | Unsigned integer with size of a pointer on the host architecture
f32 | 3.2e-7f32 | IEEE 754 binary32
f64 | 1f64 | IEEE 754 binary64
bool | true | true or false
char | 'å' | 32-bit Unicode scalar value in range 0x0000–0xD7FF, 0xE000–0x10FFFF
str | (Cannot write out values of this type) | Basically a [u8], that is, a slice (see below) of 8-bit unsigned bytes, whose elements must constitute a valid UTF-8 sequence. Values of this type are created with quotes, e.g., "hello".
! | (This type has no values) | This is called the Never type. There are no values of this type. It is the return type of functions that don’t return, or the type of infinite loop expressions.
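If you want to see these sizes for yourself, std::mem::size_of reports them in bytes. The sketch below is only an illustration; the function name fixed_sizes and the choice of types are arbitrary.

```rust
use std::mem::size_of;

// Sizes, in bytes, of a few built-in types (an arbitrary selection).
fn fixed_sizes() -> [(&'static str, usize); 7] {
    [
        ("i8", size_of::<i8>()),
        ("u16", size_of::<u16>()),
        ("i32", size_of::<i32>()),
        ("u64", size_of::<u64>()),
        ("f64", size_of::<f64>()),
        ("bool", size_of::<bool>()),
        ("char", size_of::<char>()),
    ]
}

fn main() {
    for (name, size) in fixed_sizes() {
        println!("{name}: {size}");     // i8: 1, u16: 2, i32: 4, u64: 8, f64: 8, bool: 1, char: 4
    }
    // isize and usize are pointer-sized, so this depends on the target
    println!("usize: {}", size_of::<usize>());
    // A str value has no fixed size, so it lives behind a reference;
    // &str is a pointer plus a length, that is, two machine words
    println!("&str: {}", size_of::<&str>());
}
```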

And if you want to make your own types, there are a great many ways to do this. Here are a few of them:

Kind of type | Examples | Notes
Tuple types | () | The typical positional product type.
 | (u16, char, f64) |
Array types | [i32; 5] | A sequence. Note the length is part of the type. Example: [1,1,2,3,5].
Slice types | [i32] | A subsequence of an array. You can’t work with slices directly (that is, you cannot even write out values of a slice type), but you can work with references to slices.
Struct types | struct Point { x: f64, y: f64 } | The typical named product type. Cannot be anonymous.
Enum types | enum RaceResult { Time(f64), DidNotFinish, } | Rust’s sum type. Cannot be anonymous.
Union types | union u { f: f64, i: u64 } | A way to put different things at the same memory address. Probably only useful when interoperating with C or in unsafe code.
Pointer types | &i64 | Shared reference
 | &mut i64 | Mutable reference
 | *const i64 | Raw pointer, unsafe
 | *mut i64 | Mutable raw pointer, unsafe
Function pointer types | fn(i64,bool)->char | The type of references to functions with the given argument types and result type.
Trait object types | dyn Person | The type of opaque values of another type that implements a set of traits. Used for that dynamic polymorphism thing.

Not only can you make your own types, but so have thousands of other programmers. Many are defined in the Rust standard library. Here are a few:

Module | Types
std::collections | HashMap, BTreeMap, HashSet, BTreeSet, VecDeque, LinkedList, BinaryHeap
std::vec | Vec
std::option | Option
std::iter | Iterator, DoubleEndedIterator, ExactSizeIterator, Extend, FromIterator, IntoIterator, Product, Sum
std::time | Duration, Instant, SystemTime, UNIX_EPOCH
std::io | Read, Write, BufRead, Seek, Cursor, Chain, Empty, Repeat, Take
std::boxed | Box
std::net | IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr, SocketAddrV4, SocketAddrV6, UdpSocket, TcpListener, TcpStream
std::path | Path, PathBuf
std::thread | Thread, ThreadId, JoinHandle, Builder, LocalKey
std::sync | Arc, Weak, RwLock, Mutex, MutexGuard, Condvar, Barrier, Once, OnceLock, BarrierWaitResult
Basic Types

The basic types are numbers, booleans, characters, and tuples. Numbers have the usual arithmetic (+, -, *, /, %) and bitwise (&, |, ^, <<, >>) operators. Logical operators are &&, ||, and !. Comparison operators are ==, !=, <, <=, >, and >=. Ranges are made with .. and ..=.
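A few of these operators in action. This is just a sketch; is_leap_year is an illustrative function chosen because it mixes arithmetic, comparison, and logical operators.

```rust
// Combines arithmetic, comparison, and logical operators.
fn is_leap_year(year: u32) -> bool {
    year % 4 == 0 && (year % 100 != 0 || year % 400 == 0)
}

fn main() {
    println!("{} {}", 7 / 2, 7 % 2);            // 3 1 (integer division truncates)
    println!("{} {}", -7 / 2, -7 % 2);          // -3 -1 (remainder takes the dividend's sign)
    println!("{}", 7.0 / 2.0);                  // 3.5
    println!("{} {} {}", 6 & 3, 6 | 3, 6 ^ 3);  // 2 7 5
    println!("{} {}", 1 << 4, 256 >> 2);        // 16 64
    println!("{}", is_leap_year(2024));         // true

    // .. excludes the upper bound, ..= includes it
    let exclusive: Vec<i32> = (1..5).collect();
    let inclusive: Vec<i32> = (1..=5).collect();
    println!("{exclusive:?} {inclusive:?}");    // [1, 2, 3, 4] [1, 2, 3, 4, 5]

    // Overflow can be handled explicitly
    println!("{:?}", 250u8.checked_add(10));    // None
    println!("{}", 250u8.wrapping_add(10));     // 4
}
```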

Here is an example that shows off tuples. They can be indexed and destructured:

fn main() {
    let t = (5, 8, 2);
    println!("{t:?}");                   // (5, 8, 2)
    println!("{} {} {}", t.0, t.1, t.2); // 5 8 2
    let (x, y, z) = t;
    println!("{x} {y} {z}");             // 5 8 2
}

Arrays and Slices

Arrays are fixed-size sequences of elements of the same type; the size is known at compile time and is part of the type. Slices are dynamically-sized views into arrays. A reference to a slice is stored as two words: (1) a pointer to the first element of the slice and (2) the length of the slice. Because a slice is a view into an underlying array, slices are always borrowed, so you always see them with the & operator.

fn main() {
    // These types could have been inferred, but we're showing them anyway
    let a: [i32; 5] = [10, 20, 30, 40, 50];
    let b: &[i32] = &a[1..4];
    println!("{} {} | {a:?}", a[0], a[4]);   // 10 50 | [10, 20, 30, 40, 50]
    println!("{} {} | {b:?}", b[0], b[2]);   // 20 40 | [20, 30, 40]
}

Arrays and slices have a lot of syntactic niceties, including destructuring and ranges:

TODO
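Until the full example is written, here is one possible illustration of destructuring and ranges with arrays and slices. The recursive sum function is only a demonstration of slice patterns, not how you would normally add up a slice.

```rust
// Sum a slice by peeling off the first element with a slice pattern.
fn sum(values: &[i32]) -> i32 {
    match values {
        [] => 0,
        [first, rest @ ..] => first + sum(rest),
    }
}

fn main() {
    let a = [10, 20, 30, 40, 50];

    // Array pattern: bind the ends, ignore the middle
    let [first, .., last] = a;
    println!("{first} {last}");         // 10 50

    // Range forms when slicing
    println!("{:?}", &a[..2]);          // [10, 20]
    println!("{:?}", &a[3..]);          // [40, 50]
    println!("{:?}", &a[1..=3]);        // [20, 30, 40]
    println!("{:?}", &a[..]);           // [10, 20, 30, 40, 50]

    println!("{}", sum(&a));            // 150
}
```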

And a lot of standard library operations:

fn main() {
    let mut a = [1, 2, 3, 4, 5];
    a.reverse();
    println!("{a:?}");          // [5, 4, 3, 2, 1]
    a.sort();
    println!("{a:?}");          // [1, 2, 3, 4, 5]
    a.sort_by(|a, b| b.cmp(a));
    println!("{a:?}");          // [5, 4, 3, 2, 1]
    a.sort_by_key(|x| x % 3);
    println!("{a:?}");          // [3, 4, 1, 5, 2]
    a.swap(1, 3);
    println!("{a:?}");          // [3, 5, 1, 4, 2]
    a.rotate_left(2);
    println!("{a:?}");          // [1, 4, 2, 3, 5]
    a.rotate_right(2);
    println!("{a:?}");          // [3, 5, 1, 4, 2]
}

Strings

Rust strings are not as bad as C strings, or are they?

[Image: rust-string-meme.jpg]

Conceptually, strings are sequences of bytes that comprise valid UTF-8 encodings. You are most likely to use the following two types:
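The list of types is missing from these notes. Presumably it refers to String (owned and growable) and &str (a borrowed string slice), the two string types already used in the examples above. Under that assumption, a minimal sketch of how they work together follows; the function shout is made up for illustration.

```rust
// Taking &str lets callers pass a literal or a borrowed String.
fn shout(text: &str) -> String {
    let mut result = text.to_uppercase();   // to_uppercase allocates a new String
    result.push('!');
    result
}

fn main() {
    let literal: &str = "hello";                    // borrowed string slice
    let mut owned: String = String::from("hi");     // owned, growable
    owned.push_str(" there");

    println!("{}", shout(literal));     // HELLO!
    println!("{}", shout(&owned));      // HI THERE! (&String coerces to &str)

    // len() counts bytes, not characters, because strings are UTF-8
    let word = "caf\u{e9}";             // café
    println!("{} {}", word.len(), word.chars().count());    // 5 4
}
```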

Functions

TODO

Here’s what the Rust Reference says about return expressions: “Return expressions are denoted with the keyword return. Evaluating a return expression moves its argument into the designated output location for the current function call, destroys the current function activation frame, and transfers control to the caller frame.”
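As a small sketch of where return is and is not needed, consider the following. The function names average and report are invented for this example.

```rust
// Parameter types and the return type must be written out.
fn average(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;                        // early exit uses return
    }
    let total: f64 = values.iter().sum();
    Some(total / values.len() as f64)       // final expression, no return needed
}

// A function with no declared return type returns ().
fn report(values: &[f64]) {
    match average(values) {
        Some(avg) => println!("average is {avg}"),
        None => println!("no values"),
    }
}

fn main() {
    report(&[1.0, 2.0, 6.0]);       // prints average is 3
    report(&[]);                    // prints no values
}
```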

Structs

Structs are pretty similar to tuples, but while tuple components are positional, struct components are named.

TODO example
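One possible example, reusing the Point struct from the section on variables. The methods new and distance_to, and the derived traits, are additions made for this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // Associated function, called as Point::new(...)
    fn new(x: f64, y: f64) -> Self {
        Point { x, y }
    }

    // Method, called as p.distance_to(&q)
    fn distance_to(&self, other: &Point) -> f64 {
        ((self.x - other.x).powi(2) + (self.y - other.y).powi(2)).sqrt()
    }
}

fn main() {
    let tuple = (3.0, 4.0);                     // components are positional
    let p = Point::new(tuple.0, tuple.1);       // components are named
    let origin = Point { x: 0.0, y: 0.0 };
    let shifted = Point { x: 10.0, ..p };       // struct update syntax

    println!("{p:?}");                          // Point { x: 3.0, y: 4.0 }
    println!("{}", p.distance_to(&origin));     // 5
    println!("{} {}", shifted.x, shifted.y);    // 10 4
}
```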

Enumerations

Rust’s sum types are called enumerations, or just enums. You generally use them in match, if let or while let expressions:

enum RaceResult {
    Time(f64),
    DidNotFinish,
}

fn main() {
    let alice_result = RaceResult::Time(53.23);
    let bob_result = RaceResult::DidNotFinish;

    // match expression
    match bob_result {
        RaceResult::Time(time) => println!("Bob's time: {time}"),
        RaceResult::DidNotFinish => println!("Bob did not finish"),
    }

    // if-let expression
    if let RaceResult::Time(time) = alice_result {
        println!("Alice's time: {time}");
    }
}

Optionals

Results

Ownership and Borrowing

How does Rust reclaim memory without a garbage collector and still rule out dangling references? Through the concept of ownership.

Every resource has an owner. When the owner “goes away,” the resource is dropped, meaning its storage is reclaimed.

fn main() {
    let mut picks = vec![8, 21, 5, 1, 2];
    say_hello();
    picks.push(34);
    println!("{picks:?}");
} // picks goes out of scope and is dropped (freed) here

fn say_hello() {
    let mut greeting = String::from("Hello");
    greeting.push_str(", world!");
    println!("{greeting}");
} // greeting goes out of scope and is dropped (freed) here

Data is owned by default. But there is region-based borrowing. Individual types can be marked as copy types. And there are (C++-style) destructors.
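Here is a minimal sketch of those four ideas (moves, borrows, copy types, and destructors) in one program. The names Noisy, total, and consume are hypothetical, chosen only for this illustration.

```rust
// A type with a destructor: drop runs when the owner goes away.
struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

// Borrows the values; the caller keeps ownership.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Takes ownership; the vector is dropped when this function returns.
fn consume(values: Vec<i32>) -> usize {
    values.len()
}

fn main() {
    let a = 5;
    let b = a;                      // i32 is a copy type, so a is still usable
    println!("{a} {b}");            // 5 5

    let v = vec![1, 2, 3];
    println!("{}", total(&v));      // 6 (borrowed, v is still usable)
    let w = v;                      // ownership moves from v to w
    // println!("{v:?}");           // would not compile: v was moved
    println!("{}", consume(w));     // 3 (w is moved into consume)

    let _first = Noisy("first");
    {
        let _inner = Noisy("inner");
    }                               // prints "dropping inner" here
    println!("end of main");
}                                   // prints "dropping first" here
```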

The big problem is the combination of aliasing with mutation. Can we separate those things?
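Rust’s answer is its borrowing rule: at any moment you may have many shared references or one mutable reference, but not both. The sketch below shows the rule at work; largest and double_all are made-up helper names.

```rust
// Shared borrow: read-only access.
fn largest(values: &[i32]) -> Option<i32> {
    values.iter().copied().max()
}

// Exclusive borrow: the only way to reach the vector while it is held.
fn double_all(values: &mut Vec<i32>) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

fn main() {
    let mut picks = vec![8, 21, 5];

    // Any number of shared references may coexist...
    let r1 = &picks;
    let r2 = &picks;
    println!("{} {}", r1.len(), r2[0]);     // 3 8

    // ...or exactly one mutable reference, but not both at once
    double_all(&mut picks);
    println!("{picks:?}");                  // [16, 42, 10]
    println!("{:?}", largest(&picks));      // Some(42)

    // This would not compile: first aliases picks while push mutates it
    // let first = &picks[0];
    // picks.push(34);
    // println!("{first}");
}
```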

More on Pointers

Impls and Traits

Functional Programming

Rust supports functional programming.
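A brief sketch of the functional style: higher-order functions, iterator pipelines, and folds. The functions scale_by and sum_even_squares are illustrative names, not library functions.

```rust
// A higher-order function: returns a closure that captures factor.
fn scale_by(factor: i32) -> impl Fn(i32) -> i32 {
    move |x| x * factor
}

// Sum of the squares of the even numbers, written as a pipeline.
fn sum_even_squares(values: &[i32]) -> i32 {
    values
        .iter()
        .filter(|&&x| x % 2 == 0)
        .map(|&x| x * x)
        .sum()
}

fn main() {
    let triple = scale_by(3);
    let tripled: Vec<i32> = (1..=4).map(triple).collect();
    println!("{tripled:?}");                            // [3, 6, 9, 12]

    println!("{}", sum_even_squares(&[1, 2, 3, 4]));    // 20

    let product = (1..=5).fold(1, |acc, x| acc * x);
    println!("{product}");                              // 120
}
```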

Closures

TODO
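As a placeholder until this section is written, here is a small sketch of how closures capture their environment: by shared borrow, by mutable borrow, or by move. The helpers call_twice and make_counter are hypothetical.

```rust
// Accepts any closure that can be called repeatedly and may mutate its captures.
fn call_twice<F: FnMut()>(mut f: F) {
    f();
    f();
}

// Returns a closure that owns its own counter (captured with move).
fn make_counter() -> impl FnMut() -> u32 {
    let mut count = 0;
    move || {
        count += 1;
        count
    }
}

fn main() {
    let greeting = String::from("hi");
    let greet = |name: &str| format!("{greeting}, {name}");     // borrows greeting
    println!("{}", greet("Alice"));         // hi, Alice

    let mut total = 0;
    call_twice(|| total += 10);             // mutably borrows total
    println!("{total}");                    // 20

    let mut next = make_counter();
    let first = next();
    let second = next();
    println!("{first} {second}");           // 1 2
}
```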

Concurrency

Patterns

We’ve seen uses of patterns above, but did you know there are a lot of different kinds of patterns? There are: Literal patterns, Identifier patterns, Wildcard patterns, Rest patterns, Reference patterns, Struct patterns, TupleStruct patterns, Tuple patterns, Grouped patterns, Slice patterns, Path patterns, and Range patterns.

And patterns aren’t just for let statements. They can be used in:

TODO - need a big example with lots of patterns
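Until that big example exists, here is a smaller sketch that exercises several of the pattern kinds named above inside a match expression. The Shape and Point types and the describe function are invented for this illustration.

```rust
struct Point { x: i32, y: i32 }

enum Shape {
    Empty,
    Dot(Point),
    Segment { from: Point, to: Point },
    Polygon(Vec<Point>),
}

fn describe(shape: &Shape) -> String {
    match shape {
        // Path pattern
        Shape::Empty => String::from("nothing"),
        // TupleStruct, struct, and literal patterns
        Shape::Dot(Point { x: 0, y: 0 }) => String::from("dot at the origin"),
        // Identifier patterns, with a match guard
        Shape::Dot(Point { x, y }) if x == y => format!("dot on the diagonal at {x}"),
        // Wildcard pattern and range pattern
        Shape::Dot(Point { x: _, y: 1..=9 }) => String::from("dot with small y"),
        // Rest pattern inside a struct pattern
        Shape::Dot(Point { x, .. }) => format!("dot with x = {x}"),
        // Struct pattern on an enum variant
        Shape::Segment { from, to } => format!("segment from x = {} to x = {}", from.x, to.x),
        // Slice patterns, including an or-pattern and a rest pattern
        Shape::Polygon(points) => match points.as_slice() {
            [] => String::from("polygon with no points"),
            [_] | [_, _] => String::from("degenerate polygon"),
            [first, .., last] => format!("polygon from x = {} to x = {}", first.x, last.x),
        },
    }
}

fn main() {
    let shapes = [
        Shape::Empty,
        Shape::Dot(Point { x: 0, y: 0 }),
        Shape::Dot(Point { x: 7, y: 7 }),
        Shape::Dot(Point { x: 2, y: 5 }),
        Shape::Dot(Point { x: 4, y: 40 }),
        Shape::Segment { from: Point { x: 1, y: 1 }, to: Point { x: 8, y: 3 } },
        Shape::Polygon(vec![Point { x: 0, y: 0 }, Point { x: 3, y: 0 }, Point { x: 3, y: 4 }]),
    ];
    for shape in &shapes {
        println!("{}", describe(shape));
    }
}
```

Arms are tried from top to bottom, so the more specific patterns come before the more general ones.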

Unsafe

Modules

Rust programs are made up of crates. Each crate is a collection of items. Because crates can contain thousands of items, the items may be grouped into modules. The modules are arranged in a hierarchy within the crate. You can think of the “top-level” of the crate as an unnamed module.

To refer to items, use ::. Some examples from above:

But if you use the use keyword, you don’t have to give the full path to the item.
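A small sketch of paths and use declarations follows. The module names geometry and shapes, and the functions inside them, are hypothetical; std::collections::HashMap is the real standard library type.

```rust
// Hypothetical modules, declared inline for illustration.
mod geometry {
    pub mod shapes {
        pub fn square_area(side: f64) -> f64 {
            side * side
        }
    }

    // Not marked pub, so it is private to the geometry module
    fn secret() -> u32 {
        42
    }

    pub fn reveal() -> u32 {
        secret()
    }
}

// Bring items into scope so their short names work
use geometry::shapes::square_area;
use std::collections::HashMap;

fn main() {
    println!("{}", geometry::shapes::square_area(3.0));     // full path: prints 9
    println!("{}", square_area(2.0));                       // short name: prints 4
    println!("{}", geometry::reveal());                     // prints 42
    // geometry::secret();      // would not compile: secret is private

    // HashMap instead of std::collections::HashMap
    let mut ages: HashMap<&str, u32> = HashMap::new();
    ages.insert("Alice", 30);
    println!("{:?}", ages.get("Alice"));                    // prints Some(30)
}
```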

TODO - more about modules

Cargo

Ok, we’ve talked about Rust the language, but how do you build Rust applications in real life? You use Cargo. Cargo is Rust’s build system and its package manager. In practice, you’ll rarely invoke rustc directly; instead, you will build and manage your projects with Cargo. Make sure you have it installed:

$ cargo --version
cargo 1.82.0 (8f40fc59f 2024-08-21)

Run these two commands now to get a sense of what cargo can do:

$ cargo --help
$ cargo --list

Building Simple Executables

For our first real-life application, we’ll make a simple command line greeter script. Rust packages up an executable app into a binary crate. Here’s how we initialize it:

$ cargo new greeter
    Creating binary (application) `greeter` package

This creates a new folder called greeter with these files:

greeter
├── Cargo.toml
└── src
    └── main.rs

Cargo.toml holds your project’s metadata. It will probably look like this when initialized:

[package]
name = "greeter"
version = "0.1.0"
edition = "2021"

[dependencies]

Cargo initialized a basic hello-world program in main.rs. Let’s update it to use a command line argument:

fn main() {
    let args: Vec<String> = std::env::args().collect();
    let name = if args.len() > 1 { &args[1] } else { "world" };
    println!("Hello, {}!", name);
}

Now build and run the program. Make sure you are in the greeter folder. As Rust is a compiled language, let’s build first, then run in a subsequent step:

$ cargo build
    Compiling greeter v0.1.0 (/Users/rtoal/projects/greeter)
     Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.69s

Building creates a few new files:

.
├── Cargo.lock
├── Cargo.toml
├── src
│   └── main.rs
└── target
    ├── CACHEDIR.TAG
    └── debug
        ├── build
        ├── deps
        │   └── (dozens of files here)
        ├── examples
        ├── greeter (this is the executable)
        ├── greeter.d
        └── incremental
            └── (dozens of files here)

So building created an executable in the folder target/debug. You can run it directly like so:

$ ./target/debug/greeter
Hello, world!
$ ./target/debug/greeter Alice
Hello, Alice!

Alternatively, you could run with cargo run, though note that doing so generates output from Cargo that you might not want.

$ cargo run Alice
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/greeter Alice`
Hello, Alice!

What’s this debug thing? By default, cargo build produces an executable with a lot of information packed into it that allows for symbolic debugging. When you make a final production build, known as a release build, you don’t need or want it all. You want a lighter build. Here’s how you do that:

$ cargo build --release
    Compiling greeter v0.1.0 (/Users/rtoal/projects/greeter)
     Finished `release` profile [optimized] target(s) in 0.55s

Fewer files created:

.
├── Cargo.lock
├── Cargo.toml
├── src
│   └── main.rs
└── target
    ├── CACHEDIR.TAG
    └── release
        ├── build
        ├── deps
        │   └── (just a couple of files here)
        ├── examples
        ├── greeter (this is the executable)
        ├── greeter.d
        └── incremental (no files here)

To run the release build, you can use cargo run --release or run the executable directly:

$ ./target/release/greeter Alice
Hello, Alice!

Using External Dependencies

TODO - example with rand and serde

Programs with Multiple Files

TODO

Building Libraries

Instead of a binary (executable) crate, you can build a library crate. You will do this if you want to publish a crate that other folks can use in their own applications.

Recall that when building executables, cargo arranges execution to begin at the function main in main.rs. For library crates, we’ll write the most important code in lib.rs. Let’s get started building a library holding a generic Queue type:

$ cargo new queue --lib
    Creating library `queue` package
.
├── Cargo.toml
└── src
    └── lib.rs

We’ll implement the queue in lib.rs:

use std::collections::VecDeque;

pub struct Queue<T> {
    items: VecDeque<T>
}

impl<T> Queue<T> {
    pub fn new() -> Self {
        Queue { items: VecDeque::new() }
    }

    pub fn add(&mut self, item: T) {
        self.items.push_back(item);
    }

    pub fn remove(&mut self) -> Option<T> {
        self.items.pop_front()
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_queue() {
        let mut q = Queue::new();
        q.add(1);
        q.add(2);
        assert_eq!(q.remove(), Some(1));
        assert_eq!(q.remove(), Some(2));
        assert_eq!(q.remove(), None);
    }
}

Note that the Rust convention is to put unit tests in the same source file as the code they test. Here’s how to run the tests:

$ cargo test
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.05s
Running unittests src/lib.rs (/Users/rtoal/projects/queue/target/debug/deps/queue-6d5d97537e37655a)

running 1 test
test tests::test_queue ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s