There are at least two ways to list programming languages:
But if we want to categorize languages, we need to look at things like:
Wikipedia has a categorization page that might be interesting. There are even be different ways to categorize the categorizations. Some categorizations focus on technical issues, others look at non-technical issues (markets, hardware platforms, and so on).
Technical aspects of languages will consider linguistic structure, expressive features, possibility of efficient implementation, direct support for certain programming models, and similar concerns. Some examples:
These types are not mutually exclusive: Perl is both high-level and scripting; C is considered both high-level and system. Some languages are partially visual, but you get to type bits of code into little boxes.
Other types people have identified: Toy, Educational, Very High-Level, Authoring, Compiled, Interpreted, Free-Form, Curly Brace, Applicative, Homoiconic, Von Neumann, Expression-Oriented, Persistent, Concurrent, Data-Flow, Array, Stack-Based, Concatenative, Action, Reactive, Constraint, Glue, Reflective, Query, Intermediate, Decision Table, Quantum, Hybrid, Embeddable, Macro, Tactile.
See Wikipedia’s category page on programming language classification.
Machine language is the direct representation of the code and data run directly by a computing device. Machine languages feature:
add
, sub
, div
, sqrt
) which operate on these registers and/or memoryThe machine instructions are carried out directly in the hardware of the machine, so machine code is, by definition, machine-dependent. Different machines have different instruction sets. The instructions and their operands are all just bits.
How are machine instruction sets designed?Many machine languages appear to be just thrown together with a lot of general purpose instructions. But there have been processors designed specifically for executing implementation of high-level languages. Read about the Borroughs 5000 series and successors and the Intel 432.
Here’s an example program, expressed as a bit stream, for the Intel 64 architecture:
1000100111111000101010010000000100000000000000000000000001110101
0000011001101011110000000000001111111111110000001100001111000001
111000000000001010000011111010000000001111000011
Can you tell what it does?
No?
Oh, yeah, that was hard to read since it was in binary. Traditionally, machine is written in hex. So here is the same program, now a lot easier to read?
89 F8 A9 01 00 00 00 75 06 6B C0 03 FF C0 C3 C1 E0 02 83 E8 03 C3
Now it’s quite clear that this is a function that accepts a 32-bit integer argument (call it $n$), and if it’s even, produces $3n+1$, and if odd, produces $4n-3$.
Wait, that wasn’t obvious? Maybe we should find a way to encode the instructions in a more readable fashion.
An assembly language is an encoding of machine code into something more readable. It assigns human-readable labels (or names) to storage locations, jump targets, and function starting addresses, but doesn’t really go too far beyond that. It’s really isomorphic to its machine language. Here’s the function from above on the Intel 64 architecture using the GAS assembly language:
.globl f .text f: mov %edi, %eax # Put first parameter into eax register test $1, %eax # Examine least significant bit jnz odd # If it's not a zero, jump to odd imul $3, %eax # It's even, so multiply it by 3 inc %eax # and add 1 ret # and return it odd: shl $2, %eax # It's odd, so multiply by 4 sub $3, %eax # and subtract 3 ret # and return it
And here’s the same function, written for the SPARC:
.global f f: andcc %o0, 1, %g0 bne .L1 sll %o0, 2, %g2 sll %o0, 1, %g2 add %g2, %o0, %g2 b .L2 add %g2, 1, %o0 .L1: add %g2, -3, %o0 .L2: retl nop
Virtual, or abstract, machines are not tied to any particular hardware. They are used as an intermediate step in compilation or can be directly interpreted. These machines have their own instruction sets, their own machine languages, and often, a “text format” that serves as an assembly language. Here’s the function above in the text format for the Java Virtual Machine (JVM):
0: iload_0 1: iconst_2 2: irem 3: ifne 12 6: iconst_3 7: iload_0 8: imul 9: iconst_1 10: iadd 11: ireturn 12: iconst_4 13: iload_0 14: imul 15: iconst_3 16: isub 17: ireturn
Another popular virtual machine is Web Assembly (WASM). Here’s the function above in the WASM text format (WAT):
(module (func $f (param $n i32) (result i32) (local $result i32) (if (result i32) (i32.eqz (i32.rem_s (local.get $n) (i32.const 2))) (i32.add (i32.mul (i32.const 3) (local.get $n)) (i32.const 1)) (i32.sub (i32.mul (i32.const 4) (local.get $n)) (i32.const 3)) ) ) (export "f" (func $f)) )
Many compiler writers are fond of LLVM, which has its own intermediate language. Here’s the function above in LLVM’s intermediate language:
define i32 @f(i32 %n) { entry: %rem = srem i32 %n, 2 %cmp = icmp eq i32 %rem, 0 br i1 %cmp, label %even, label %odd even: %mul1 = mul i32 %n, 3 %add1 = add i32 %mul1, 1 ret i32 %add1 odd: %mul2 = mul i32 %n, 4 %sub1 = sub i32 %mul2, 3 ret i32 %sub1 }
A high-level language gets away from all the constraints of a machines, whether real or virtual. HLLs may have features such as:
2 * (y^5) >= 88 && sqrt(4.8) / 2 % 3 == 9
)The previous example looks like this in Fortran 77 (note how the code begins in column 7 or beyond):
INTEGER FUNCTION F(N) INTEGER N IF (MOD(N, 2) .EQ. 0) THEN F = 3 * N + 1 ELSE F = 4 * N - 3 END IF RETURN END
and like this in Fortran 90 (where the column requirements were finally removed):
integer function f (n) implicit none integer, intent(in) :: n if (mod(n, 2) == 0) then f = 3 * n + 1 else f = 4 * n - 3 end if end function f
and like this in Ada:
function F (N: Integer) return Integer is begin if N mod 2 = 0 then return 3 * N + 1; else return 4 * N - 3; end if; end F;
and like this in C and C++:
int f(const int n) { return (n % 2 == 0) ? 3 * n + 1 : 4 * n - 3; }
and like this in Java and C#:
class ThingThatHoldsTheFunctionUsedInTheExampleOnThisPage { public static int f(int n) { return (n % 2 == 0) ? 3 * n + 1 : 4 * n - 3; } }
and like this in Scala:
def f(n: Int) = if (n % 2 == 0) 3 * n + 1 else 4 * n - 3;
and like this in Kotlin:
fun f(n: Int) = if (n % 2 == 0) 3 * n + 1 else 4 * n - 3
and like this in JavaScript:
function f(n) { return (n % 2 === 0) ? 3 * n + 1 : 4 * n - 3; }
and like this in CoffeeScript:
f = (n) -> if n % 2 == 0 then 3 * n - 1 else 4 * n + 3
and like this in Smalltalk:
f ^self % 2 = 0 ifTrue:[3 * self + 1] ifFalse:[4 * self - 3]
and like this in Standard ML:
fun f n = if n mod 2 = 0 then 3 * n + 1 else 4 * n - 3
and like this in Elm:
f n = if n % 2 == 0 then 3 * n + 1 else 4 * n - 3
and like this in Haskell (thanks @kaftoot):
f n | even(n) = 3 * n + 1 | otherwise = 4 * n - 3
and like this in Julia (yes, 3n is “three times n”):
f(n) = iseven(n) ? 3n+1 : 4n-3
and like this in Lisp:
(defun f (n) (if (= (mod n 2) 0) (+ (* 3 n) 1) (- (* 4 n) 3)))
and like this in Clojure:
(defn f [n] (if (= (mod n 2) 0) (+ (* 3 n) 1) (- (* 4 n) 3)))
and like this in Prolog:
f(N, X) :- 0 is mod(N, 2), X is 3 * N + 1. f(N, X) :- 1 is mod(N, 2), X is 4 * N - 3.
and like this in Erlang:
f(N) when (N band 1) == 0 -> 3 * N + 1; f(N) -> 4 * N - 3.
and like this in Perl:
sub f { my $n = shift; $n % 2 == 0 ? 3 * $n + 1 : 4 * $n - 3; }
and like this in Python:
def f(n): return 3 * n + 1 if n % 2 == 0 else 4 * n - 3
and like this in Ruby:
def f(n) n % 2 == 0 ? 3 * n + 1 : 4 * n - 3; end
and like this in Go:
func f(n int) int { if n % 2 == 0 { return 3 * n + 1 } else { return 4 * n - 3 } }
and like this in Odin:
f :: proc(n: int) -> int { return n % 2 == 0 ? 3 * n + 1 : 4 * n - 3; }
and like this in Rust:
fn f(n: int) -> int { if n % 2 == 0 {3 * n + 1} else {4 * n - 3} }
and like this in Swift:
func f(n: Int) -> Int { return n % 2 == 0 ? 3 * n + 1 : 4 * n - 3 }
and like this in Gleam:
fn f(n: Int) -> Int { case n % 2 == 0 { True -> 3 * n + 1 False -> 4 * n - 3 } }
and like this in K:
f:{:[x!2;(4*x)-3;1+3*x]}The language Forth is interesting. It’s a high-level language that people can write in, but it has intermediate language vibes. The language is nothing more than a sequence of words that are concatenated together. And like the JVM, the code is executed on a stack. Here’s the function in Forth:
: f 2 mod 0= if 3 * 1 + else 4 * 3 - then ;
System programming languages differ from application programming languages in that they are more concerned with managing a computer system rather than solving general problems in health care, game playing, or finance. In a system language, the programmer, not the runtime system, is generally responsible for:
Scripting languages are used for wiring together systems and applications at a very high level. They are almost always extremely expressive (they do a lot with very little code) and usually dynamic (meaning the compiler does very little, while the run-time system does almost everything).
An esoteric language is one not intended to be taken seriously. They can be jokes, near-minimalistic, academic studies of computing theory, or despotic (purposely obfuscated or non-deterministic).
See Wikipedia’s article on esoteric languages.
John Ousterhout once claimed that programming languages roughly fall into two types, which he called scripting and system languages. You can read about this idea at Wikipedia. Then read this two-part article (Part 1, Part 2) on the dichotomy and on languages that seem to reject it.
Kay wrote, many years ago, while recording the early history of Smalltalk (a must read):
Programming languages can be categorized in a number of ways: imperative, applicative, logic-based, problem-oriented, etc. But they all seem to be either an "agglutination of features" or a "crystallization of style." COBOL, PL/1, Ada, etc., belong to the first kind; LISP, APL— and Smalltalk—are the second kind. It is probably not an accident that the agglutinative languages all seem to have been instigated by committees, and the crystallization languages by a single person.
C++ gets bashed a lot. The language’s designer has a great response to these critics:
There are only two kinds of languages: the ones people complain about and the ones nobody uses.
Very often a programming language is created to help people program in a certain way. A programming paradigm is a style, or “way,” of programming. Some languages make it easy to write in some paradigms but not others.
Never use the phrase “programming language paradigm.”A paradigm is a way of doing something (like programming), not a concrete thing (like a language). Now, it’s true that if a programming language L happens to make a particular programming paradigm P easy to express, then we often say “L is a P language” (e.g. “Haskell is a functional programming language”) but that does not mean there is any such thing as a “functional language paradigm”.
You should know these common paradigms:
Others include: pure functional, lazy, object-oriented, automata-like, concurrent, concatenative, tacit, point-free, intentional, literate, reactive, generic, non-deterministic, quantum.
Paradigms are not meant to be mutually exclusive; a single program can feature multiple paradigms!
Make sure to check out Wikipedia’s entry on Programming Paradigms.
How about an overview of some of the major paradigms?
Control flow in imperative programming is explicit: commands show how the computation takes place, step by step. Each step affects the global state of the computation. The most basic form of imperative program uses explicit jumps and labels, mimicking a machine:
result = [] i = 0 start: numPeople = length(people) if i >= numPeople goto finished p = people[i] nameLength = length(p.name) if nameLength <= 5 goto nextOne upperName = toUpper(p.name) addToList(result, upperName) nextOne: i = i + 1 goto start finished: return sort(result)
Structured programming is a kind of imperative programming that gets rid of the jumps and labels! Instead, control flow is made implicit by through combining and nesting statement blocks that carry out conditional and iterative flow by virtue of how they are organized. Variables can be made local to blocks, significantly reducing confusion and the potential for many errors.
result = []; for i = 0; i < length(people); i++ { p = people[i]; if length(p.name)) > 5 { addToList(result, toUpper(p.name)); } } return sort(result);
Early languages emphasizing structured programming include Algol 60, PL/I, Algol 68, Pascal, C, Ada 83, Modula, Modula-2. Structured programming as a discipline is sometimes though to have been started by a famous letter by Edsger Dijkstra titled Go to Statement Considered Harmful. The structured programming revolution of the late 1960s and early 1970s marked an enormous advance in the field of software engineering.
OOP is based on the sending of messages to objects. Objects respond to messages by performing operations, called methods. Messages can have arguments. A society of objects, each with their own local memory and own set of operations, has a much different feel than the monolithic-processor-plus-single-shared-memory architecture of non object-oriented languages.
One of the more visible aspects of the more pure-ish OO languages is that conditionals and loops are just messages whose arguments are blocks of executable code. In a Smalltalk-like syntax:
result := List new. people each: [:p | p name length greaterThan: 5 ifTrue: [result add (p name upper)] ] result sort. ^result
This can be shortened to:
^people filter: [:p | p name length greaterThan: 5] map: [:p | p name upper] sort
Many popular languages that call themselves OO languages (e.g., Java, C++), really just take some elements of OOP and mix them in to imperative-looking code. In the following, we can see that length
and toUpper
are methods rather than top-level functions, but the for
and if
are back to being control structures:
result = [] for p in people { if p.name.length > 5 { result.add(p.name.toUpper); } } return result.sort;
The first object oriented language was Simula-67; Smalltalk followed soon after as the first “pure” object-oriented language. Many languages designed from the 1980s to the present have labeled themselves object-oriented, notably C++, CLOS (object system of Common Lisp), Eiffel, Modula-3, Ada 95, Java, C#, Ruby.
Control flow in declarative programming is implicit: the programmer states only what the result should look like, not how to obtain it.
select upper(name) from people where length(name) > 5 order by name
No loops, no assignments, etc. Whatever engine that interprets this code is just supposed go get the desired information, and can use whatever approach it wants.
In functional programming, control flow is expressed by combining function calls, rather than by assigning values to variables:
sort( fix(f => p => ifThenElse(equals(p, emptylist), emptylist, ifThenElse(greater(length(name(head(p))), 5), append(to_upper(name(head(p))), f(tail(p))), f(tail(people)))))(people))
Yikes! We’ll describe that later. For now, be thankful there’s usually syntactic sugar:
let fun uppercasedLongNames [] = [] | uppercasedLongNames (p :: ps) = if length(name p) > 5 then (to_upper(name p))::(uppercasedLongNames ps) else (uppercasedLongNames ps) in sort(uppercasedLongNames(people))
Huh? That still isn’t very pretty. Why do people like this stuff? Well the real power of this paradigm comes from passing functions to functions (and returning functions from functions).
sort( filter(s => length s > 5, map(p => to_upper(name p), people)))
We can do better by using the cool |>
operator. Here x |> f
just means f(x)
. The operator has very low precedence so you can read things left-to-right:
people |> map (p => to_upper (name p)) |> filter (s => length s > 5) |> sort
Let’s keep going! Notice that you wouldn’t write map(s => square(x))
, right? You would write map(square)
. We can do something similar above, but we have to use function composition, you know, (f o g)x
is f(g(x))
, so:
people |> map (to_upper o name) |> filter (s => length s > 5) |> sort
Here are three things to read to get the gist of functional programming:
With functional programming:
Some people like to say:
Many languages have a neat little thing called comprehensions that combine map and filter.
sorted(p.name.upper() for p in people if len(p.name) > 5)
Logic programming and constraint programming are two paradigms in which programs are built by setting up relations that specify facts and inference rules, and asking whether or not something is true (i.e. specifying a goal.) Unification and backtracking to find solutions (i.e.. satisfy goals) takes place automatically.
Languages that emphasize this paradigm: Prolog, GHC, Parlog, Vulcan, Polka, Mercury, Fnil.
There’s another paradigm called functional-logic programming, of which the Verse language from Epic Games is an example.
One of the characteristics of a language is its support for particular programming paradigms. For example, Smalltalk has direct support for programming in the object-oriented way, so it might be called an object-oriented language. OCaml, Lisp, Scheme, and JavaScript programs tend to make heavy use of passing functions around so they are called “functional languages” despite having variables and many imperative constructs.
There are two very important observations here:
“OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them.”
Here are some questions useful for your spaced repetition learning. Many of the answers are not found on this page. Some will have popped up in lecture. Others will require you to do your own research.
@_
to hold all the parameters passed to the function, so they are not named like they are in other languages. You can extract them with shift
.(x => f(x))
? f
We’ve covered: