A list is a positionally-ordered, completely unrestricted sequence. You can add, update, or remove items at any position. You can lookup an item by position, or find the index at which a given value appears.
There are tons of list operations. Here is a small sampling:
Operation | Description |
---|---|
add(x) | add item $x$ at the end |
addFirst(x) | add item $x$ at the beginning |
addLast(x) | add item $x$ at the end |
addAt(x, i) | add item $x$ so it is at position $i$ after the add. |
remove(x) | remove item $x$ |
removeAt(i) | remove the item at position $i$ |
updateAt(i, x) | replace the item at position $i$ with $x$ |
getAt(i) | return the element at position $i$ |
size() | return the number of items |
isEmpty() | return whether this list has no items |
contains(x) | return whether $x$ appears |
indexOf(x) | return the index of the first occurrence of $x$ |
lastIndexOf(x) | return the index of the last occurrence of $x$ |
indexOf(x, n) | return the index of the $n$th occurrence of $x$ |
reverse() | reverse this list |
shuffle() | permute the elements of this list |
sort() | sort the items according to their natural order |
sort(c) | sort the items using comparator $c$ |
subList(i,j) | the sublist from index $i$ inclusive to $j$ exclusive |
take(n) | same as subList(0, n) |
drop(n) | same as subList(n, size()) |
head() | same as subList(0, 1) |
tail() | same as subList(1, size()) |
map(f) | apply function $f$ to each element |
filter(p) | the elements for which predicate $p$ holds |
append(list) | The list made by appending another list to this list |
Many of these have mutable and immutable variations. If you are building an immutable list, some of these methods will return a new list which is just like the original except with the differences specified by the operation. For a mutable list, some of these methods will directly modify the list itself.
Lists can be used to model:
among other things.
Yes, it’s true: there's already a List
interface in the Java Core API and a few implementing classes (including ArrayList
and LinkedList
), but we're going to write lists from scratch because:
Interestingly, the difference between mutable lists and immutable lists is so great, there are separate interfaces for each.
Mutable vs. Immutable ListsThe flavor of immutable list we will study is known as a persistent data structure and is one of the powerful mechanisms underlying functional and logic programming.
Mutable lists are used for storing things in a collection that you intend to add items to and remove items from.
We start with a very basic mutable list interface with just a few operations:
import java.util.function.Consumer; public interface SimpleList { /** * Returns the number of elements in the list. */ int size(); /** * Inserts <code>item</code> as the new first item. */ void addFirst(Object item); /** * Inserts <code>item</code> as the new last item. */ void addLast(Object item); /** * Inserts <code>item</code> so that after insertion it is the item at the given * index position. * * @throws IndexOutOfBoundsException if index not in [0,size]. */ void add(int index, Object item); /** * Removes the item at the given index position. * * @throws IndexOutOfBoundsException if index not in [0,size). */ void remove(int index); /** * Returns the item at the given index position. * * @throws IndexOutOfBoundsException if index not in [0,size). */ Object get(int index); /** * Replaces the item at the given index position with <code>item</code>. * * @throws IndexOutOfBoundsException if index not in [0,size). */ void set(int index, Object item); /** * Returns the index of the first item that is <code>".equals"</code> to the * given item. */ int indexOf(Object item); /** * Applies <code>consumer</code> to each list element. */ void forEach(Consumer<Object> consumer); }
There are a few ways to represent mutable lists. We’ll look only at two: (1) A doubly-linked collection of nodes, and (2) a bounded array.
Professional linked implementations are generally doubly linked and make use of a "header node" that contains no information. These choices greatly simply insertion and deletion since we don’t have to worry about nulls.
There are no nulls in this implementation (except for one fleeting microsecond in the field declaration for the header, but you can probably figure out why that has to be):
import java.util.NoSuchElementException; import java.util.function.Consumer; import java.util.function.Predicate; import java.util.function.UnaryOperator; /** * A simple linked list class, implemented completely from scratch, using doubly * linked nodes and a cool "header" node to greatly simplify insertion and * deletions on the end of the list. */ public class SimpleLinkedList { /** * Doubly-linked node class, completely private to the List class, as clients * don't care about the implementation of the list. This is a regular class and * not a record because nodes are mutable. */ private class Node { Object item; Node next; Node previous; Node(Object item, Node next, Node previous) { this.item = item; this.next = next; this.previous = previous; } } /** * The list itself maintains only a reference to its "header" node. The header * is a node that does not store any data. Its 'next' field points to the first * item in the list and its 'previous' field points to the last item. This makes * all insertions and deletions uniform, even at the beginning and the end of * the list! */ private Node header = new Node(null, null, null); /** * The number of items in the list, stored to make size() O(1). */ private int size = 0; public SimpleLinkedList() { // The empty list is made by pointing the header to itself. header.next = header.previous = header; } public int size() { // The size stored as a field for easy lookup return size; } public void addFirst(Object item) { // The first node is the one right after the header. addBefore(item, header.next); } public void addLast(Object item) { // The last node is the one right before the header. addBefore(item, header); } public void add(int index, Object item) { // This one is trickier, as there is a special case. If the index is equal // to the size, we are just adding the new element at the end. Otherwise // we use the special nodeAt() utility to point to the existing node at the // desired position. Adding before this node does the trick. addBefore(item, (index == size ? header : nodeAt(index))); } public void remove(int index) { // The utility methods do all the work here. remove(nodeAt(index)); } public Object get(int index) { // Many thanks to the nodeAt() utility! return nodeAt(index).item; } public void set(int index, Object item) { // nodeAt() is sure cool, innit? nodeAt(index).item = item; } public int indexOf(Object item) { // Ah the old "linear search" pattern: a for-loop with a return in // the middle. Note that to iterate the actual nodes, we start at // header.next, since header does not hold a list item. The loop // terminates when we get back to the header. var index = 0; for (var node = header.next; node != header; node = node.next) { if (node.item.equals(item)) { return index; } index++; } return -1; } public void forEach(Consumer<Object> consumer) { // Iterate from the first node (the one after the header). for (var node = header.next; node != header; node = node.next) { consumer.accept(node.item); } } /** * Get the node at a given index. */ private Node nodeAt(int index) { // The only legal indexes are 0, 1, ... size-1. if (index < 0 || index >= size) { throw new IndexOutOfBoundsException(index + " for size " + size); } // Only way to get by index in a simple linked list is to walk and // count. There are other kinds of lists that give quicker access by // position, but those structures are for another time. Node node = header; for (var i = 0; i <= index; i++) { node = node.next; } return node; } private void addBefore(Object item, Node node) { var newNode = new Node(item, node, node.previous); newNode.previous.next = newNode; newNode.next.previous = newNode; size++; } /** * Removes the referenced node by making its neighbors point around it. */ private void remove(Node node) { // Just in case if (node == header) { throw new NoSuchElementException(); } node.previous.next = node.next; node.next.previous = node.previous; size--; } }
We will explore these implementations, in class, on a whiteboard.
When using arrays, we can opt for fixed-size or “expandable” structures. Unlike stacks and queues, lists allow insertion “in the middle” which for arrays can have nasty performance implications as elements have to be “sifted.” Then again, with arrays, we can look up elements by position immediately, but with links we have to “search.”
TODO images
Here is a bounded array implementation:
TODO
The “expandable array” implementation is left to you.
Array Lists vs. Linked ListsPay attention people this shows up all the time in internship interviews.
Operation ArrayList LinkedList Lookup by Position Fast (always instant) Slow (unless small index) Insert / Delete Slow (unless at end) Fast (once at position)
If lists are immutable, there is a beautiful (elegant) singly-linked implementation. Before we get to it, we need to change our thinking a bit. Instead of saying a list is a sequence of elements, we’re going to say something pretty neat:
A list is either empty or is an item attached to a list.
Really? It’s true! Notice:
[]
, is a list.8 :: []
is a list.2 :: 8 :: []
is a list.5 :: 2 :: 8 :: []
is a list.1 :: 5 :: 2 :: 8 :: []
is a list.So empty lists and nodes at the two “flavors” of a list. In pictures, our lists look like this:
Time for an implementation! And what a great time it is, to introduce the sealed interface!
We’ll start small, with the only abstract methods being size
, head
, tail
, at
, and forEach
. Two static methods will help with construction: of
and from
.
import java.util.NoSuchElementException; import java.util.function.Consumer; import java.util.function.Predicate; import java.util.function.UnaryOperator; /** * A simple immutable linked list, implement completely from scratch as a * singly-linked structure. There aren't too many operations, as additional will * be left for homework. * * The implementation uses a sum type because I have major null-phobia. Not * going to pay the billion-dollar mistake, not with this one. */ public sealed interface SimpleImmutableList permits EmptyList, ListNode { int size(); Object head(); SimpleImmutableList tail(); static SimpleImmutableList of(Object... items) { SimpleImmutableList list = new EmptyList(); for (var i = items.length - 1; i >= 0; i--) { list = new ListNode(items[i], list); } return list; } static SimpleImmutableList from(Object head, SimpleImmutableList tail) { return new ListNode(head, tail); } Object at(int index); void forEach(Consumer<Object> consumer); } final record EmptyList() implements SimpleImmutableList { public int size() { return 0; } public Object head() { throw new NoSuchElementException(); } public SimpleImmutableList tail() { throw new NoSuchElementException(); } public Object at(int index) { throw new NoSuchElementException(); } public void forEach(Consumer<Object> consumer) { // Intentionally empty: nothing to iterate } } final record ListNode( Object head, SimpleImmutableList tail) implements SimpleImmutableList { public int size() { return 1 + tail.size(); } public Object at(int index) { return index == 0 ? head : tail.at(index - 1); } public void forEach(Consumer<Object> consumer) { consumer.accept(head); tail.forEach(consumer); } }
We will explore these implementations, in class, on a whiteboard.
take
, drop
, reversed
, shuffled
, sorted
, map
, and filter
. Do some research to see what other kinds of methods appear in a list data type, and implement as many as you can.
This class is pretty large; we expect the test suite to be huge. Here is a partial list what we need to test:
add(i, x)
followed by get(i)
returns $x$, where $$ is legal.addLast()
100 times in succession, with the values 0 through 99, into an empty list, then calling get(i)
for all $i \in 0..99$ returns $i$ each time. And size()
will return 100.addFirst
and addLast
calls starting at an empty list, for values 1..10, will give the list [9 7 5 3 1 2 4 6 8 10]
.removeAt(0)
n times, the list is empty and has size 0.get
method does not affect the sizeset(i, x)
followed by get(i)
returns $x$, where $i$ is legal.indexOf
anything on an empty list is -1.indexOf("a")
on ["b", "a", "a"]
returns 1.lastIndexOf("a")
on ["b", "a", "a"]
returns 2.Unit tests are shown in a homework assignment.
We’ve covered: