A collection is an object that holds a group of objects. There a so many variations on how a collection’s elements are arranged, and what operations are allowed. Before we look at specific collections, let’s take a minute to appreciate how wild the scope of variations is when it comes to collections:
So many collections come with the Java Core API. Let’s get a big picture and look at a few examples.
Know the DocsThe official documentation comes with some decent reads. Check out the official Collections Overview and the official Collections Tutorial.
Start by learning the names of following main interfaces (hang in there, well see descriptions later):
Iterable └── Collection ├── Set │ └── SortedSet │ └── NavigableSet ├── List └── Queue ├── Deque | └──────────────┐ └── BlockingQueue │ ├── BlockingDeque └── TransferQueue Map ├── SortedMap │ └── NavigableMap | └───────────────┐ └── ConcurrentMap │ └── ConcurrentNavigableMap
Next, get a feel for the fact that there are quite a few implementations for each of the interfaces. Here are almost all of the classes, organized by the interfaces they implement (we’ve left out the crazy-specific ones, like JobStateReasons). For now, just look at the names and get a sense of what’s there; descriptions are coming up very soon.
Interface | Classes |
---|---|
Set | HashSet, LinkedHashSet, EnumSet, CopyOnWriteArraySet |
NavigableSet | TreeSet, ConcurrentSkipListSet |
List | ArrayList, LinkedList, CopyOnWriteArrayList |
Queue | ConcurrentLinkedQueue, PriorityQueue |
Deque | ArrayDeque, ConcurrentLinkedDeque, LinkedList |
BlockingQueue | ArrayBlockingQueue, LinkedBlockingQueue, PriorityBlockingQueue, SynchronousQueue, DelayQueue |
TransferQueue | LinkedTransferQueue |
BlockingDeque | LinkedBlockingDeque |
Map | HashMap, LinkedHashMap, EnumMap, IdentityHashMap, WeakHashMap |
NavigableMap | TreeMap |
ConcurrentMap | ConcurrentHashMap |
ConcurrentNavigableMap | ConcurrentSkipListMap |
Now it’s time to describe what each interface is all about. You’ll see some scary-looking syntax, like <T>
, <?>
, <? super T>
, <? extends T>
and type names such as Supplier
, Consumer
, and Predicate
. These will be explained when we get to examples.
Don‘t PanicThese notes are provided more as a reference than a tutorial. Read in order to get the “big picture” sense of what each of the interfaces and classes provide. No one memorizes all the details.
If something is iterable, you can (1) obtain an iterator for it, (2) obtain a spliterator for it, and (3) iterate over it with its forEach
method. This interface also has special powers: the Java statement for (var x: a) {...}
is defined to work only when a
is an array or is of a class implementing Iterable
.
public interface Iterable<T> { default void forEach(Consumer<? super T> action); Iterator<T> iterator(); default Spliterator<T> spliterator(); }
A collection is something that (1) can be added to and removed from, (2) can be asked for its size and whether it is empty, (3) has a membership test, (4) can have its items streamed, and (5) can have its items dumped into an array.
public interface Collection<E> extends Iterable<E> { int size(); boolean isEmpty(); boolean contains(Object o); boolean add(E e); boolean remove(Object element); default boolean removeIf(Predicate<? super E> filter); boolean containsAll(Collection<?> c); boolean addAll(Collection<? extends E> c); boolean removeAll(Collection<?> c); boolean retainAll(Collection<?> c); void clear(); default Stream<E> stream(); default Stream<E> parallelStream(); Object[] toArray(); <T> T[] toArray(T[] a); default <T> T[] toArray(IntFunction<T[]> generator); }
A set is a collection that contains no duplicate elements. More formally, sets contain no pair of elements e1
and e2
such that e1.equals(e2)
, and at most one null
element.
At the interface level, sets don’t add anything to the collection interface except a couple static convenience methods for creating unmodifiable sets. It is expected that all implementations of the interface have constructors that ensure no duplicate elements are created.
public interface Set<E> extends Collection<E> { static <E> Set<E> of(E... elements); static <E> Set<E> copyOf(Collection<? extends E> c); }
A sorted set is a set that further guarantees that its iterator will traverse the set in ascending element order, sorted according to the natural ordering of its elements, or by a comparator provided at sorted set creation time. Because of the internal ordering you can get subsets of a set that contain those elements in a certain range of values.
public interface SortedSet<E> extends Set<E> { Comparator<? super E> comparator(); E first(); E last(); SortedSet<E> subSet(E fromElement, E toElement); SortedSet<E> headSet(E toElement); SortedSet<E> tailSet(E fromElement); }
A navigable set is a sorted set will additional features.
public interface NavigableSet<E> extends SortedSet<E> { Iterator<E> descendingIterator(); E floor(E e); E ceiling(E e); E lower(E e); E higher(E e); E pollFirst(); E pollLast(); NavigableSet<E> descendingSet(); NavigableSet<E> subSet(E fromElement, boolean fromInclusive, E toElement, boolean toInclusive); NavigableSet<E> headSet(E toElement, boolean inclusive); }
A list is an ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index, and search for elements in the list.
public interface List<E> extends Collection<E> { E get(int index); E set(int index, E e); void add(int index, E e); E remove(int index); boolean addAll(int index, Collection<? extends E> c); default void replaceAll(UnaryOperator<E> operator); default void sort(Comparator<? super E> c); int indexOf(Object o); int lastIndexOf(Object o); static <E> List<E> of(E... elements); static <E> List<E> copyOf(Collection<? extends E> coll); ListIterator<E> listIterator(); ListIterator<E> listIterator(int index); List<E> subList(int fromIndex, int toIndex); }
A map is an object that maps keys to values. A map cannot contain duplicate keys; each key can map to at most one value.
public interface Map<K,V> { boolean containsKey(Object key); boolean containsValue(Object value); boolean isEmpty(); int size(); V put(K key, V value); default V putIfAbsent(K key, V value); default V compute(K key, BiFunction<? super K, ? super V, ? extends V> mapper); default V computeIfPresent(K key, BiFunction<? super K, ? super V, ? extends V> mapper); default V computeIfAbsent(K key, Function<? super K, ? extends V> mapper); default V replace(K key, V value); default boolean replace(K key, V oldValue, V newValue); default V merge(K key, V value, BiFunction<? super V, ? super V, ? extends V> remapper); V get(Object key); default V getOrDefault(Object key, V defaultValue); V remove(Object key); default boolean remove(Object key, Object value); void putAll(Map<? extends K, ? extends V> t); default void replaceAll(BiFunction<? super K, ? super V, ? extends V> function); void clear(); Set<K> keySet(); Collection<V> values(); Set<Map.Entry<K, V>> entrySet(); static <K,V> Map<K,V> ofEntries(Map.Entry<? extends K, ? extends V>... entries) static <K,V> Map<K,V> copyOf(Map<? extends K, ? extends V> map); default void forEach(BiConsumer<? super K, ? super V> action); static <K,V> Map.Entry<K,V> entry(K k, V v) public interface Entry<K,V> { K getKey(); V getValue(); V setValue(V value); } }
A sorted map is a map that is iterated in ascending key order, sorted according to the natural ordering of its keys, or by a comparator provided at sorted map creation time.
public interface SortedMap<K, V> extends Map<K, V> { Comparator<? super K> comparator(); K firstKey(); K lastKey(); SortedMap<K,V> subMap(K fromKey, K toKey); SortedMap<K,V> headMap(K toKey); SortedMap<K,V> tailMap(K fromKey); }
Any class implementing this interface is supposed to make sure that all operations are thread-safe and atomic.
public interface ConcurrentMap<K, V> extends Map<K, V> { // No methods added }
A queue is a place to store things until you need them later. Different kinds of queues will impose different ordering restrictions, like FIFO, LIFO, or priority-first. Each of the add, remove, and peek operations come in pairs: one that returns a value on failure and one that throws on failure.
public interface Queue<E> extends Collection<E> { boolean offer(E e); // adds if possible (return false if it can't) boolean add(E e); // adds if possible (throw if it can't) E poll(); // retrieve and remove head (return null if empty) E remove(); // retrieve and remove head (throw if empty) E peek(); // retrieve but do not remove head (return null if empty) E element(); // retrieve but do not remove head (throw if empty) }
A blocking queue is a queue with operations that wait for the queue to become non-empty when retrieving an element, and that wait for space to become available in the queue when storing an element.
public interface BlockingQueue<E> extends Queue<E> { boolean offer(E e, long timeout, TimeUnit unit) throws InterruptedException; void put(E e) throws InterruptedException; boolean add(E e); E poll(long timeout, TimeUnit unit) throws InterruptedException; E take() throws InterruptedException; int drainTo(Collection<? super E> c); int drainTo(Collection<? super E> c, int maxElements); int remainingCapacity(); }
A transfer queue is a blocking queue in which producers may wait for consumers to receive elements.
public interface TransferQueue<E> extends BlockingQueue<E> { boolean hasWaitingConsumer(); int getWaitingConsumerCount(); void transfer(E e) throws InterruptedException; boolean tryTransfer(E e); boolean tryTransfer(E e, long timeout, TimeUnit unit) throws InterruptedException; }
Here's a description of most of them, anyway:
Again, Don‘t PanicYa ain’t gonna need all of these. They are listed here only to give you an appreciation of all of what Java gives you. Learn them as needed. You will most likely start with
ArrayList
,HashMap
, andTreeMap
. Maybe that is all you will use for 95% of your work. That is OK.
Class | Implemented Interfaces |
Thread safe? |
Description |
---|---|---|---|
HashSet | Set | No | A simple set that stores elements based on their hash codes.
If the hashCode() function is good, the add ,
remove , and contains methods should run in
near constant time.
|
LinkedHashSet | Set | No | A HashSet whose values, when you iterate over them, come out in the same order you put them in. |
EnumSet | Set | No | A set that can only contain enum values of a given type. This is a very high-performance set (backed by a bit-vector). |
TreeSet | NavigableSet | No | A set whose values, when you iterate over them, come out in sorted order. A red-black tree is used behind the scenes. |
ArrayList | List | No | List implemented with a resizable array. |
LinkedList | List Deque |
No | Doubly-linked list implementation. Good when insertions and deletions are frequent. Access via the Deque interface gives you constant time access to the beginning and end elements. |
HashMap | Map | No | Good general-purpose map that hashes on element keys. Nulls are supported for keys and values. |
LinkedHashMap | Map | No | A HashMap whose entries are iterated over in the order they were inserted. |
WeakHashMap | Map | No | A hash map with weak keys — when a key object isn't referenced by anything except the key reference in the map, the entry will get removed. |
IdentityHashMap | Map | No | A map that uses == , rather than Object.equals() on its keys. Runs fast. |
EnumMap | Map | No | A map whose keys can only be enum values of a given type. This is a very high-performance map (backed by an array). |
TreeMap | NavigableMap | No | A map whose keys, when you iterate over them, come out in sorted order. A red-black tree is used behind the scenes. |
PriorityQueue | Queue | No | An unbounded priority queue implemented with a heap. It uses the element's natural ordering or a comparator passed in at construction time. The head element is the smallest. offer(), poll(), remove() and add run in logarithmic time. remove(Object) runs in linear time. peek, element, and size run in constant time. |
CopyOnWriteArrayList | List | Yes | A thread-safe variant of ArrayList in which all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array. |
ConcurrentLinkedQueue | Queue | Yes | Thread-safe, non-blocking, unbounded FIFO queue, optimized for use in situations where several threads are inserting and deleting items. Null elements are not allowed. |
ArrayBlockingQueue | Queue | Yes | A bounded, non-extensible, blocking FIFO queue, backed by a fixed-size array. |
LinkedBlockingQueue | Queue | Yes | A blocking FIFO queue. You can optionally supply a maximum capacity at construction time; the default is Integer.MAX_SIZE . Large capacities are okay because the elements are only allocated as needed. |
PriorityBlockingQueue | Queue | Yes | Essentially an unbounded priority queue with blocking semantics. Here unbounded means that there will be blocking when trying to remove from an empty queue, but the queue really only fills up when system resources are exhausted and an OutOfMemoryError is thrown. Nulls are not allowed. |
DelayQueue | Queue | Yes | An unbounded, blocking queue, whose elements must all implement the Delayed interface. An element can only be taken from the queue if its delay has expired. The head of the queue is the one whose delay expired furthest in the past. |
SynchronousQueue | Queue | Yes | Here is a verbatim quote from the official documentation: A blocking queue in which each insert operation must wait for a corresponding remove operation by another thread, and vice versa. A synchronous queue does not have any internal capacity, not even a capacity of one. You cannot peek at a synchronous queue because an element is only present when you try to remove it; you cannot insert an element (using any method) unless another thread is trying to remove it; you cannot iterate as there is nothing to iterate. The head of the queue is the element that the first queued inserting thread is trying to add to the queue; if there is no such queued thread then no element is available for removal and poll() will return null . For purposes of other Collection methods (for example contains ), a SynchronousQueue acts as an empty collection. This queue does not permit null elements. |
There are three ways two iterate over a collection or map: (1) A for-loop, (2) a forEach
method, or (3) an explicit iterator object.
Set<Dog> kennel = new HashSet<Dog>(); // For loop for (Dog d : kennel) { d.bark(); } // Foreach method kennel.forEach(d -> d.bark()) // The explicit iterator way for (Iterator<Dog> it = kennel.iterator(); i.hasNext();) { it.next().bark(); }
The first two only work if kennel
as an iterator
method that returns an iterator object. What's that iterator thingy? There's an interface in java.util:
public interface Iterator<E> { boolean hasNext(); E next(); void remove(); default void forEachRemaining(Consumer<? super E> action) }
The remove()
operation is in there to make it safe to update a collection while an iteration is in progress. If you remove an element using a method on the collection, the next use of the iterator will throw a ConcurrentModificationException
:
for (Iterator<Dog> it = kennel.iterator(); it.hasNext();) { Dog d = it.next(); if (d.getBreed() == Dog.POODLE) {kennel.remove(d);// NOOOOOOOOOOOOOOO! it.remove(); // Yes, do it this way } }
Explicit vs. implicit iteratorsIf you use a for loop or a
forEach
method, an iterator is created for you behind the scenes, but you cannot access it through your code! So you'll not be able to call remove, among other things. Therefore, if you need to mutate the list during traversal, or if you need to iterate over two or more collections at a time, use iterator objects explicitly.
If you know you are using a list, you can get a java.util.ListIterator
instead — it's got more features:
public interface ListIterator<E> extends Iterator<E> { boolean hasNext(); E next(); boolean hasPrevious(); E previous(); int nextIndex(); int previousIndex(); void remove(); void set(E e); void add(E e); }
You can easily sort a list of strings in Java. Just write:
Collections.sort(aStringList);
But suppose you created an Employee class like this:
class Employee { private String id; private Date hireDate; private Date birthDate; private BigDecimal salary; private Department department; . . . }
and wrote:
Collections.sort(anEmployeeList);
you'd get a compile-time error, because Java doesn’t know how to sort employees. It can only sort objects of a class that implements the interface java.util.Comparable
. This interface has a method compareTo
which defines the natural order on elements of that type. The invocation:
x.compareTo(y)
returns a negative integer if x is “less than” y, zero if they are “equal”, and a positive integer if x is “greater than” y. The Collections.sort
method uses compareTo
internally to do its work. Fortunately, many Java classes already implement Comparable
, such as String
, Date
, Integer
, BigInteger
, Time
, TimeUnit
, Enum
, Long
, LocalDate
, Duration
, and many others.
But for employees, you may want to sort them by pretty much anything, so they might not have a specific natural order. Instead we need a way to sort by age, or by salary, or, well just about any dimension, really. So we’d like to specify the sort dimension in the sort call, using a comparator:
Collections.sort(anEmployeeList, ageComparator);
where ageComparator
is an object implementing the Comparator
interface. Comparator
has one method, called compare
. The invocation:
c.compare(x, y)
returns a negative integer if x is "less than" y, zero if they are "equal", and a positive integer if x is "greater than" y. (Yep, just like compare
.)
Thankfully, the Comparator
interface is one of Java’s functional interfaces, meaning that you don’t have to create your own comparator class and override the compare
function and all that boilerplate; instead, you can just write the compare function as a lambda expression when you sort:
Collections.sort(anEmployeeList, (e1, e2) -> e1.birthDate().compareTo(e2.birthDate()));
Here is a simple application illustrating how you make a class that both implements Comparable
and defines additional comparators.
import java.util.ArrayList; import java.util.Collections; import java.util.Comparator; import java.util.Date; import java.util.List; record Dog(String name, Date birthday, String breed) implements Comparable<Dog> { // Though it feels somewhat arbitrary, we're going to make the natural order for // dogs to be an ordering by name. It's fine for example code whose purpose is // to highlight the comparable interface vs. comparators. public int compareTo(Dog that) { return this.name().compareTo(that.name()); } } /** * A short application showing how to use Comparable and Comparator. */ public class OrderingDemo { private static List<Dog> dogs = new ArrayList<Dog>(List.of( new Dog("Nimbus", new Date(), "Bedlington Terrier"), new Dog("Blue Jay", new Date(), "Siberian Husky"), new Dog("Mulberry", new Date(), "Dachshund"), new Dog("Lucy", new Date(), "Border Collie"))); public static void main(String[] args) { System.out.println("Sorted naturally"); Collections.sort(dogs); dogs.forEach(System.out::println); System.out.println("Sorted naturally in reverse"); Collections.sort(dogs, Collections.reverseOrder()); dogs.forEach(System.out::println); System.out.println("Sorted by breed"); Collections.sort(dogs, (d1, d2) -> d1.breed().compareTo(d2.breed())); dogs.forEach(System.out::println); System.out.println("Sorted by breed in reverse"); Comparator<Dog> breedComparator = (d1, d2) -> d1.breed().compareTo(d2.breed()); Collections.sort(dogs, Collections.reverseOrder(breedComparator)); dogs.forEach(System.out::println); } }
There are good notes on this topic in the Java Tutorial page on Object Ordering.
Ordering PrimitivesIf you wanted to order objects by a field whose value was a primitive type (like
int
ordouble
) there is of course no way to write things like:because primitives cannot have methods called on them. Instead you would have to write:p1.score().compareTo(p2.score())Integer.compareTo(p1.score(), p2.score())Yet another case of that infuriating distinction Java has between primitives and non-primitives. Kotlin programmers are happy they don’t have to deal with such silliness.
You can get an unmodifiable, or read-only, view of an existing collection, for example:
var theGang = Collections.unmodifiableList(kennel);
After this, calls such as:
theGang.add(bijou);
will throw UnsupportedOperationException
.
The static methods List.of
, List.copyOf
, Set.of
, Set.copyOf
, Map.of
, and Map.copyOf
also produce unmodifiable collections, as do a handful of other operations like Collections.emptySet
, Collections.singletonList
, and Collections.nCopies
.
Not all of the algorithms that run on collections, maps, and arrays are methods in collection classes. Instead, a large number of general purpose algorithms are defined within the utility classes Collections
and Arrays
. Descriptions of some of these utility functions follow.
Dozens of utility methods can be found in the Collections
class. Here are some of the most interesting ones:
Here are some of the interesting methods from Arrays
.
We’ve covered:
Collections
Arrays