Software systems manipulate different kinds of things like accounts, calendars, cards, contacts, rational numbers, dates, windows, animals, carrots, buildings, countries, players, toolbars, menus, songs, artists, and playlists. Humans naturally think about these things in terms of what they can and cannot do. You can sing songs but not windows; players can move but numbers cannot; accounts have a balance but carrots don’t. What things can and cannot do gives rise to the notion of type.
In many (but not all!) programming languages, you can create new types via the class construct. A class is a factory to make objects that all share a common structure and behavior. The class is, in almost every language, carried around with the objects and can be inspected at run time, even if the language purports to be “statically typed”.
Type: Behavior. The allowed operations. An object can have multiple types.
Class: A factory for creating objects. An object is created by, and “has”, exactly one class.
A class defines the properties (state) and operations (behavior) for its instances, and may include constructors for creating instances. It may even include some metadata, too. A convenient way to show off a class, in a language-independent fashion, is to diagram it (here I’ve used a notation from UML):
To read this diagram:
- means private and + means public.
Class members are essentially top-level entities: we write Point.ORIGIN and Point.midpoint.
Instance members are attached to individual instances: we write p.x and p.distanceFromOrigin().
Notice how we made the properties private and the operations public. That choice, together with the way we named our operations, shows that our intent is to make points immutable and polygons mutable.
Security
Secure Software Development stresses that we should favor immutability whenever possible. When something needs to be mutable, we must control the mutability, either by prohibiting copies or by always making defensive copies. Also, we should validate the arguments to every method. In the examples that follow, we will try to follow those principles.
What is the UML, you may ask?
A good resource is Scott Ambler's, check it out.
In most languages, a class gives rise to a type. In Java, for example, given:
interface Printable { ... }
interface Runner { ... }
class Animal implements Printable, Serializable { ... }
class Dog extends Animal implements Runner { ... }
var winner = new Dog(...);
the object bound to winner has exactly one class, namely Dog, but it has many types: Dog, Runner, Animal, Printable, Serializable, Object.
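The same distinction between one class and many types can be observed at run time. The following is a minimal sketch in Python, assuming hypothetical classes that loosely mirror the Java example (Python has no interface construct, so plain classes stand in for the interfaces):

```python
# Hypothetical classes loosely mirroring the Java example above.
class Printable: ...
class Runner: ...
class Animal(Printable): ...
class Dog(Animal, Runner): ...

winner = Dog()

# Exactly one class created the object...
print(type(winner).__name__)

# ...but the object can be used wherever any of these types is expected.
for t in (Dog, Runner, Animal, Printable, object):
    print(t.__name__, isinstance(winner, t))
```

Here `type(winner)` reports the single class, while `isinstance` answers the broader question of which types the object conforms to.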
Classes are unrelated to Object-Oriented Programming
You can do object-oriented programming very well without classes. You can even have classes without object-orientation. People often get them confused, since most explanations of OOP feature classes prominently, but again, this need not be the case.
A class is a factory for objects, and every object it creates has a given structure and behavior. Thus the class serves as a type for the instances that it creates. Classes come in many variations:
Some programming languages have abused the idea of a class, using it for other things. For example, Java’s desire to make everything a class led to the idea of a utility class, which isn’t a factory or a type at all, but rather a big namespace for functions, which it calls “static methods.” Java’s designers probably thought they had a great idea at the time, but no, it’s not good. And it gets worse perhaps: the inability of Java to house any code outside of a class means any app you write, even a short command line script, must be housed in a class, with code launched from a static method called main. And you may have guessed it, such a class is called an application class.
Let’s implement our point and polygon classes, which we diagrammed above, in a few languages.
Note that Point is a data class (immutable, with value semantics), while Polygon is a mutable everyday class. Remember that mutability does not play well with references, so when we store the points in a polygon, we need to use a container with value semantics, not reference semantics; or, if the language does not give us containers with value semantics, then we must make defensive copies both on construction and when retrieving vertices.
Always handle mutable objects securely
If you are making a class for mutable objects, make sure you either (1) prevent these objects from being copied at all, or (2) make defensive copies of their fields. Otherwise you will end up with unintended sharing.
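To make the risk concrete, here is a small sketch in Python contrasting a class that stores and returns its list directly with one that copies defensively. The class names `LeakyRoster` and `SafeRoster` are hypothetical and exist only for this illustration:

```python
class LeakyRoster:
    """Keeps the caller's list as-is, so state is shared unintentionally."""
    def __init__(self, names):
        self._names = names            # no copy: aliases the caller's list

    def names(self):
        return self._names             # no copy: exposes internal state


class SafeRoster:
    """Copies on the way in and on the way out."""
    def __init__(self, names):
        self._names = list(names)      # defensive copy on construction

    def names(self):
        return list(self._names)       # defensive copy on retrieval


original = ["Ana", "Bo"]
leaky, safe = LeakyRoster(original), SafeRoster(original)

original.append("Cy")                  # caller mutates its own list
safe.names().append("Di")              # caller mutates the returned copy

print(leaky.names())   # ['Ana', 'Bo', 'Cy']
print(safe.names())    # ['Ana', 'Bo']
```

The leaky version changed without any of its methods being called; the safe version can only change through its own methods.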
Ruby does classes pretty cleanly! Instance fields are marked with @ and are scoped entirely to the class. You need to write methods to access them, or use attr_reader to automatically generate the accessor methods. Class fields are named beginning with @@. Instance methods and class methods are easy to identify.
class Point
attr_reader :x, :y
def initialize(x, y)
@x = x
@y = y
end
@@origin = Point.new(0, 0)
def self.ORIGIN = @@origin
def self.midpoint_of(p, q) = Point.new((p.x + q.x) / 2.0, (p.y + q.y) / 2.0)
def distance_from_origin = Math.hypot(@x, @y)
def reflection_about_origin = Point.new(-@x, -@y)
end
class Polygon
def initialize(*points)
raise 'Need at least three points' if points.length < 3
@points = Array.new(points)
end
def perimeter
result = 0
@points.each_with_index do |p, i|
q = @points[(i + 1) % @points.length]
result += Math.hypot(p.x - q.x, p.y - q.y)
end
result
end
def area
result = 0
@points.each_with_index do |p, i|
q = @points[(i + 1) % @points.length]
result += p.x * q.y - q.x * p.y
end
result / 2.0
end
def vertices
@points.dup
end
def add_vertex(index, x, y)
raise "Cannot add at #{index}" if index < 0 or index > @points.length
@points.insert(index, Point.new(x, y))
end
def update_vertex(index, x, y)
raise "Cannot update at #{index}" if index < 0 or index >= @points.length
@points[index] = Point.new(x, y)
end
def remove_vertex(index)
raise "Cannot remove at #{index}" if index < 0 or index >= @points.length
@points.delete_at(index)
end
end
In the code above, we will (a) explain why points are immutable, (b) note how class methods vs. instance methods are defined, (c) explain why @@origin exists, and (d) check that defensive copies are made for the polygon vertices both on construction and on query.
You can add new methods to Ruby classes after the fact, and existing instances will pick them up. 😮😮😮😳
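In Ruby this is done by reopening the class with another class Point ... end block. The same effect can be demonstrated in Python, used here purely for illustration; the method name `manhattan_norm` is a hypothetical addition, not part of the classes above:

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

p = Point(3, 4)          # created before the new method exists

def manhattan_norm(self):
    return abs(self.x) + abs(self.y)

# Attach the method to the class after the fact
Point.manhattan_norm = manhattan_norm

# The existing instance picks it up, because method lookup goes through the class
print(p.manhattan_norm())   # 7
```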
JavaScript has a syntax for private (#) properties and classwide (static) properties. Immutability is ensured via Object.freeze, which not only prevents the properties of an instance from being changed, but also prevents the addition of new properties and the deletion of existing properties.
class Point {
constructor(x, y) { Object.assign(this, { x, y }); Object.freeze(this) }
get distanceFromOrigin() { return Math.hypot(this.x, this.y) }
get reflectionAboutOrigin() { return new Point(-this.x, -this.y) }
static ORIGIN = new Point(0, 0)
static midpointOf(p, q) { return new Point((p.x + q.x) / 2, (p.y + q.y) / 2.0) }
}
class Polygon {
#points
constructor(...points) {
if (points.length < 3) {
throw new Error('Need at least three points')
}
this.#points = points.slice()
Object.freeze(this)
}
get perimeter() {
let result = 0
for (let i = 0; i < this.#points.length; i += 1) {
const [p, q] = [this.#points[i], this.#points[(i + 1) % this.#points.length]]
result += Math.hypot(p.x - q.x, p.y - q.y)
}
return result
}
get area() {
let result = 0
for (let i = 0; i < this.#points.length; i += 1) {
const [p, q] = [this.#points[i], this.#points[(i + 1) % this.#points.length]]
result += (p.x * q.y) - (q.x * p.y)
}
return result / 2
}
get vertices() {
return this.#points.slice()
}
addVertex(index, x, y) {
if (index < 0 || index > this.#points.length) {
throw new Error(`Cannot add at index: ${index}`)
}
this.#points.splice(index, 0, new Point(x, y))
}
updateVertex(index, x, y) {
if (index < 0 || index >= this.#points.length) {
throw new Error(`Cannot update at index: ${index}`)
}
this.#points[index] = new Point(x, y)
}
removeVertex(index) {
if (index < 0 || index >= this.#points.length) {
throw new Error(`Cannot remove at index: ${index}`)
}
if (this.#points.length === 3) {
throw new Error('Removal would make this polygon degenerate')
}
this.#points.splice(index, 1)
}
}
Notice that we made several of the methods getters so that they look like data properties.
We’ve prevented the addition of fields and methods to the point and polygon objects, but we can still modify the static (class) properties (try changing Point.ORIGIN) and even add new properties to each class. If we want to prevent even that, we can add:
Object.freeze(Point)
Object.freeze(Polygon)
We start with the point class. We want points to be immutable (and have value semantics), just the kind of thing Java has the record keyword for:
public record Point(double x, double y) {
public static final Point ORIGIN = new Point(0, 0);
public Point {
if (Double.isNaN(x) || Double.isNaN(y)) {
throw new IllegalArgumentException("Coordinates can not be NaN");
}
}
public double distanceFromOrigin() {
return Math.hypot(x, y);
}
public Point reflectionAboutOrigin() {
return new Point(-x, -y);
}
public static Point midpointOf(Point p, Point q) {
return new Point((p.x + q.x) / 2.0, (p.y + q.y) / 2.0);
}
}
When using records, Java safely generates equals, hashCode, and toString methods (though we can override them if we want). Because you should know what is going on behind the scenes, here is approximately the class that Java generates for the record above:
// Don't write this yourself, we're only showing roughly what the record produces.
import java.util.Objects;
public final class Point {
    public static final Point ORIGIN = new Point(0, 0);
    private final double x;
    private final double y;
    public Point(double x, double y) {
        if (Double.isNaN(x) || Double.isNaN(y)) {
            throw new IllegalArgumentException("Coordinates can not be NaN");
        }
        this.x = x;
        this.y = y;
    }
    public double x() {
        return x;
    }
    public double y() {
        return y;
    }
    public double distanceFromOrigin() {
        return Math.hypot(x, y);
    }
    public Point reflectionAboutOrigin() {
        return new Point(-x, -y);
    }
    public static Point midpointOf(Point p, Point q) {
        return new Point((p.x + q.x) / 2.0, (p.y + q.y) / 2.0);
    }
    @Override
    public boolean equals(Object o) {
        // Records compare double components with Double.compare, not ==
        return (o instanceof Point other)
            && Double.compare(x, other.x) == 0
            && Double.compare(y, other.y) == 0;
    }
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
    @Override
    public String toString() {
        return "Point[x=" + x + ", y=" + y + "]";
    }
}
Polygons will be mutable, but only through their own methods. As always, we have to be super, super careful here because the internal state contains a collection. We have to prevent users of our class from modifying our points by obtaining a reference to our internal list of points. So just like in previous examples:
import java.util.List;
import java.util.ArrayList;
/**
* A mutable polygon containing at least three vertices, where the vertices
* are assumed to be listed in counter-clockwise order.
*/
public class Polygon {
private ArrayList<Point> points;
public Polygon(List<Point> points) {
if (points == null || points.size() < 3) {
throw new IllegalArgumentException("Need at least 3 vertices");
}
for (var point : points) {
if (point == null) {
throw new IllegalArgumentException("Null points are not allowed");
}
}
// Important to make a defensive copy!
this.points = new ArrayList<>(points);
}
public double perimeter() {
var result = 0.0;
for (var i = 0; i < points.size(); i++) {
var p = points.get(i);
var q = points.get((i + 1) % points.size());
result += Math.hypot(q.x() - p.x(), q.y() - p.y());
}
return result;
}
public double area() {
var result = 0.0;
for (var i = 0; i < points.size(); i++) {
var p = points.get(i);
var q = points.get((i + 1) % points.size());
result += p.x() * q.y() - q.x() * p.y();
}
return result / 2.0;
}
public List<Point> vertices() {
// Give the callers an immutable fixed-size list
return List.copyOf(points);
}
public void addVertex(int index, double x, double y) {
if (index < 0 || index > points.size()) {
throw new IllegalArgumentException("Cannot add at index: " + index);
}
points.add(index, new Point(x, y));
}
public void updateVertex(int index, double x, double y) {
if (index < 0 || index >= points.size()) {
throw new IllegalArgumentException("Cannot update at index: " + index);
}
points.set(index, new Point(x, y));
}
public void removeVertex(int index) {
if (index < 0 || index >= points.size()) {
throw new IllegalArgumentException("Cannot remove at index: " + index);
}
if (points.size() == 3) {
throw new IllegalStateException("Removal would make the polygon degenerate");
}
points.remove(index);
}
}
Here are the same classes in Kotlin:
data class Point(val x: Double, val y: Double) {
init {
require(!x.isNaN() && !y.isNaN()) {
"Coordinates can not be NaN"
}
}
companion object {
val ORIGIN = Point(0.0, 0.0)
fun midpointOf(p: Point, q: Point): Point {
return Point((p.x + q.x) / 2.0, (p.y + q.y) / 2.0)
}
}
fun distanceFromOrigin(): Double {
return kotlin.math.hypot(x, y)
}
fun reflectionAboutOrigin(): Point {
return Point(-x, -y)
}
}
Kotlin doesn’t use the word static but instead uses a companion object. That way everything is an object with properties and methods. “Static” feels a little weird, but it is super common; Kotlin is trying to be cleaner, that’s all.
/**
* A mutable polygon containing at least three vertices, where the vertices are
* assumed to be listed in counter-clockwise order.
*/
class Polygon(vertices: List<Point>) {
private var vertices: MutableList<Point> = vertices.toMutableList()
init {
require(vertices.size >= 3) { "Need at least 3 vertices" }
// Defensive copy already handled by `toMutableList()`
}
fun perimeter(): Double {
var result = 0.0
for (i in vertices.indices) {
val p = vertices[i]
val q = vertices[(i + 1) % vertices.size]
result += kotlin.math.hypot(q.x - p.x, q.y - p.y)
}
return result
}
fun area(): Double {
var result = 0.0
for (i in vertices.indices) {
val p = vertices[i]
val q = vertices[(i + 1) % vertices.size]
result += p.x * q.y - q.x * p.y
}
return result / 2.0
}
fun vertices(): List<Point> {
// Return an immutable list as a defensive copy
return vertices.toList()
}
fun addVertex(index: Int, x: Double, y: Double) {
require(index in 0..vertices.size) {
"Cannot add at index: $index"
}
vertices.add(index, Point(x, y))
}
fun updateVertex(index: Int, x: Double, y: Double) {
require(index in 0 until vertices.size) {
"Cannot update at index: $index"
}
vertices[index] = Point(x, y)
}
fun removeVertex(index: Int) {
require(index in 0 until vertices.size) {
"Cannot remove at index: $index"
}
require(vertices.size > 3) {
"Removal would make the polygon degenerate"
}
vertices.removeAt(index)
}
}
Here are the same classes in Swift:
import Foundation
struct Point {
let x: Double
let y: Double
static let ORIGIN = Point(x: 0.0, y: 0.0)
func distanceFromOrigin() -> Double { return hypot(x, y) }
func reflectionAboutOrigin() -> Point { return Point(x: -x, y: -y) }
static func midpoint(of p: Point, and q: Point) -> Point {
return Point(x: (p.x + q.x) / 2.0, y: (p.y + q.y) / 2.0)
}
}
class Polygon {
private var points: [Point]
init(vertices: [Point]) {
guard vertices.count >= 3 else {
fatalError("Need at least 3 vertices")
}
// Automatic defensive copy because arrays (and Point) are value types
self.points = vertices
}
func perimeter() -> Double {
var result = 0.0
for i in 0..<points.count {
let p = points[i]
let q = points[(i + 1) % points.count]
result += hypot(q.x - p.x, q.y - p.y)
}
return result
}
func area() -> Double {
var result = 0.0
for i in 0..<points.count {
let p = points[i]
let q = points[(i + 1) % points.count]
result += p.x * q.y - q.x * p.y
}
return result / 2.0
}
func vertices() -> [Point] {
// Automatic defensive copy because arrays (and Point) are value types
return points
}
func addVertex(at index: Int, x: Double, y: Double) {
guard index >= 0 && index <= points.count else {
fatalError("Cannot add at index: \(index)")
}
points.insert(Point(x: x, y: y), at: index)
}
func updateVertex(at index: Int, x: Double, y: Double) {
guard index >= 0 && index < points.count else {
fatalError("Cannot update at index: \(index)")
}
points[index] = Point(x: x, y: y)
}
func removeVertex(at index: Int) {
guard index >= 0 && index < points.count else {
fatalError("Cannot remove at index: \(index)")
}
guard points.count > 3 else {
fatalError("Removal would make the polygon degenerate")
}
points.remove(at: index)
}
}
Here are the same classes in Python. Python is interesting because (1) there’s really no serious concept of hiding the “fields” from the outside (though there are conventions to do so), and (2) the receiver of the method is an explicit parameter to the constructors and instance methods:
import math

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def distance_from_origin(self):
        return math.hypot(self.x, self.y)

    @property
    def reflection_about_origin(self):
        return Point(-self.x, -self.y)

    @staticmethod
    def midpoint_of(p, q):
        return Point((p.x + q.x) / 2, (p.y + q.y) / 2)

# Class-level constant, attached once the class exists
Point.ORIGIN = Point(0, 0)

class Polygon:
    def __init__(self, *points):
        if len(points) < 3:
            raise ValueError('Need at least three points')
        # Defensive copy on construction
        self.points = list(points)

    @property
    def perimeter(self):
        result = 0
        for i in range(len(self.points)):
            p, q = self.points[i], self.points[(i + 1) % len(self.points)]
            result += math.hypot(p.x - q.x, p.y - q.y)
        return result

    @property
    def area(self):
        result = 0
        for i in range(len(self.points)):
            p, q = self.points[i], self.points[(i + 1) % len(self.points)]
            result += (p.x * q.y) - (q.x * p.y)
        return result / 2

    @property
    def vertices(self):
        # Defensive copy on retrieval
        return list(self.points)

    def add_vertex(self, index, x, y):
        if index < 0 or index > len(self.points):
            raise ValueError(f"Cannot add at index: {index}")
        self.points.insert(index, Point(x, y))

    def update_vertex(self, index, x, y):
        if index < 0 or index >= len(self.points):
            raise ValueError(f"Cannot update at index: {index}")
        self.points[index] = Point(x, y)

    def remove_vertex(self, index):
        if index < 0 or index >= len(self.points):
            raise ValueError(f"Cannot remove at index: {index}")
        if len(self.points) == 3:
            raise ValueError('Removal would make this polygon degenerate')
        del self.points[index]
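The Python Point above leaves x and y writable, whereas the design calls for immutable points with value semantics. One possible way to get closer to that intent, sketched here on the assumption that the standard dataclasses module is acceptable for this purpose, is a frozen dataclass:

```python
import math
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Point:
    x: float
    y: float

    @property
    def distance_from_origin(self):
        return math.hypot(self.x, self.y)

    @staticmethod
    def midpoint_of(p, q):
        return Point((p.x + q.x) / 2, (p.y + q.y) / 2)

p = Point(3, 4)

# Generated __eq__ compares field values, giving value semantics
print(p == Point(3, 4))

# Assignment to a field is rejected on a frozen dataclass
try:
    p.x = 10
except FrozenInstanceError as e:
    print("rejected:", e)
```

Freezing blocks ordinary attribute assignment; it is a strong convention rather than an absolute guarantee, since Python still offers low-level ways around it.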
When we design classes we pay attention to three aspects:
The protocol, interface, "contract", or behavior. Given primarily by constructor and method signatures.
(Should be hidden) The low-level structural details. Given by the field declarations.
(Should be hidden) The bodies of the constructors and methods.
The most visible structural components of classes (in most languages) are:
Usually the term multiple inheritance refers to a class being derived from, or extending, multiple classes. This is a much-debated feature—some languages have it, some do not. A lot of people hate it! The problem is that classes can have implementation, and implementation inheritance has a lot of ambiguities associated with it; it turns out to be very complex and messy.
What are some of the issues? Here’s an example to look at:
Assume here that d is an object of class D. We have quite a few decisions to make.
D inherits a field named y from classes B and C, making us wonder what the expression “d.y” might mean. We could make it a compile-time error to have the multiple inheritance because of the name clash. We could allow the extension, but throw out the field y. We could also envision d having either one or two fields called y (that is, we can merge the fields). If we don't merge them, we might need some syntax like d.B::y and d.C::y (that is, require qualification).
D inherits a field named z from classes B and C. As in the previous item, we could disallow the extension or simply throw out the field z. Or we can envision having either one or two fields called z. If one, WHICH one, B’s or C’s? That is, would d.z have type Integer or String? Do we resolve the ambiguity by alphabetical order or by the name of the class that appeared first in the extends clause of class D’s declaration? If two, we might need some syntax like d.B::z and d.C::z.
D inherits an operation named g from classes B and C. Like in the previous items we could disallow the extension or simply throw out g. If we allow the extension what would d.g(c) mean? Which operation would it call, B’s or C’s? We could resolve this as in the previous item. Or maybe both should be called, but which one FIRST? Or maybe we inherit both bodies but require explicit qualification in the call, such as d.B::g(c) and d.C::g(c).
D inherits two operations named h; d obviously has two h’s that overload each other.
Does d have two fields called x or just one?
Does d have two methods called f or just one?
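As one data point on how a real language answers some of these questions, here is a sketch in Python. The class and method names follow the discussion above, but the bodies are hypothetical. Python allows the extension, keeps a single g chosen by the order of the base classes in D's declaration, and still permits explicit qualification:

```python
class A:
    def f(self):
        return "A.f"

class B(A):
    def g(self):
        return "B.g"

class C(A):
    def g(self):
        return "C.g"

class D(B, C):      # B is listed first, so B's g wins
    pass

d = D()
print(d.g())                              # B.g

# The lookup order Python computed for D; A appears only once
print([k.__name__ for k in D.__mro__])    # ['D', 'B', 'C', 'A', 'object']

# Explicit qualification is still possible
print(C.g(d))                             # C.g
```

Other languages make different choices, so this illustrates one resolution strategy rather than the only one.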
Pure interface inheritance doesn't have these ambiguities. Since the methods in an interface don't have bodies, a class implementing multiple interfaces has only one implementation for a method even when there appears to be a name clash!
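A rough sketch of this idea in Python follows. Python has no interface construct, so abstract base classes containing only abstract methods stand in for interfaces here; the names `Drawable`, `Exportable`, and `Chart` are hypothetical:

```python
from abc import ABC, abstractmethod

class Drawable(ABC):
    @abstractmethod
    def render(self): ...

class Exportable(ABC):
    @abstractmethod
    def render(self): ...       # same name as in Drawable, still no body to clash

class Chart(Drawable, Exportable):
    def render(self):           # the one and only implementation
        return "chart"

c = Chart()
print(c.render())
print(isinstance(c, Drawable), isinstance(c, Exportable))
```

Both "interfaces" are satisfied by the single render method, so there is nothing to disambiguate.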
Does Java have pure interface inheritance? Well, not really. A Java interface may contain:
Some of these are susceptible to name clashes.
A few tips for class designers follow.
Think: every operation should either succeed or fail. If it succeeds, great; if it fails, throw an exception or return an optional or a result value (one that contains either a success value or an error value). Avoid merely sending “failure codes” back to clients if you can help it. Doing so puts the burden of checking on the client, and client programmers often forget the check.
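A small sketch of the difference, in Python. The `Account` class, its method names, and the `InsufficientFunds` exception are hypothetical and serve only to contrast the two styles:

```python
class InsufficientFunds(Exception):
    pass

class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw_with_code(self, amount):
        # Discouraged style: failure is reported by a return value
        if amount > self.balance:
            return False
        self.balance -= amount
        return True

    def withdraw(self, amount):
        # Preferred style: failure cannot be silently ignored
        if amount > self.balance:
            raise InsufficientFunds(f"balance {self.balance} is less than {amount}")
        self.balance -= amount

account = Account(100)

# A careless caller forgets to check the result and carries on as if it worked
account.withdraw_with_code(500)

# With an exception the caller has to deal with the failure explicitly
try:
    account.withdraw(500)
except InsufficientFunds as e:
    print("failed:", e)
```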
Hiding the representation is a really good idea, for these four primary reasons:
In practice this means:
Don't stuff the class full of too many fields or too many methods. For example, if your Customer class has fields called street, city, state and zip as well as name and account number, you should introduce a new class called Address. If you have way too many methods, you may need to think about factoring the responsibilities of the class into two or more classes.
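The Customer and Address split described above might look like the following sketch in Python. The field names and sample values are assumptions made for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Address:
    street: str
    city: str
    state: str
    zip_code: str

@dataclass
class Customer:
    name: str
    account_number: str
    address: Address        # four loose fields replaced by one cohesive object

customer = Customer(
    name="Ada",
    account_number="A-1001",
    address=Address("1 Main St", "Springfield", "IL", "62701"),
)
print(customer.address.city)
```

Address can now be validated, formatted, and reused (for example by a hypothetical Supplier class) without touching Customer.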
Immutable objects (objects whose values never change after they have been created) can be awesome. They are:
Sometimes you'll want to hide constructors and instead expose methods that return new objects (often called static factory methods). Advantages:
Here are some questions useful for your spaced repetition learning. Many of the answers are not found on this page. Some will have popped up in lecture. Others will require you to do your own research.
We’ve covered: