There are at least two ways to LIST programming languages:
But if we want to CATEGORIZE languages, we need to look at the look and feel of the language, its execution model, or the kind of programming paradigms most naturally supported.
Wikipedia has a categorization page that might be interesting. There are even different ways to categorize the categorizations. Some categorizations focus on technical issues; others look at non-technical issues (markets, hardware platforms, and so on).
Technical aspects of languages will consider linguistic structure, expressive features, possibility of efficient implementation, direct support for certain programming models, and similar concerns. Some examples:
These types are not mutually exclusive: Perl is both high-level and scripting; C is considered both high-level and system. Some languages are partially visual, but you get to type bits of code into little boxes.
Other types people have identified: Toy, Educational, Very High-Level, Authoring, Compiled, Interpreted, Free-Form, Curly Brace, Applicative, Homoiconic, Von Neumann, Expression-Oriented, Persistent, Concurrent, Data-Flow, Array, Stack-Based, Concatenative, Action, Reactive, Constraint, Glue, Reflective, Query, Intermediate, Decision Table, Quantum, Hybrid, Embeddable, Macro, Tactile. See Wikipedia’s category page on programming language classification.
Machine language is the direct representation of the code and data run directly by a computing device. Machine languages feature:
Instructions (such as add, sub, div, sqrt) which operate on these registers and/or memory
The machine instructions are carried out in the hardware of the machine, so machine code is by definition machine-dependent. Different machines have different instruction sets. The instructions and their operands are all just bits.
Machine code is usually written in hex. Here’s an example for the Intel 64 architecture:
89 F8 A9 01 00 00 00 75 06 6B C0 03 FF C0 C3 C1 E0 02 83 E8 03 C3
Can you tell what it does?
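Since machine code really is just bits, the hex string above can be loaded into a plain byte sequence and inspected like any other data. Here is a minimal Python sketch; the variable names are ours and purely illustrative.

```python
# The Intel 64 machine code from above, written as hex digits.
hex_text = "89 F8 A9 01 00 00 00 75 06 6B C0 03 FF C0 C3 C1 E0 02 83 E8 03 C3"

# bytes.fromhex skips the spaces and yields the raw bytes.
code = bytes.fromhex(hex_text)

print(len(code))               # number of bytes in the function
print(format(code[0], "08b"))  # the first byte, 0x89, as eight bits
```

Nothing in the byte sequence itself marks where one instruction ends and the next begins; that knowledge lives in the processor (or a disassembler).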
How are machine instruction sets designed?
Many machine languages appear to be just thrown together with a lot of general-purpose instructions. But there have been processors designed specifically for executing implementations of high-level languages. Make sure to read about the Burroughs 5000 series and successors and the Intel 432.
An assembly language is an encoding of machine code into something more readable. It assigns human-readable labels (or names) to storage locations, jump targets, and subroutine starting addresses, but doesn’t really go too far beyond that. It’s really isomorphic to its machine language. Here’s the function from above on the Intel 64 architecture using the GAS assembly language:
        .globl  f
        .text
f:
        mov     %edi, %eax      # Put first parameter into eax register
        test    $1, %eax        # Examine least significant bit
        jnz     odd             # If it's not a zero, jump to odd
        imul    $3, %eax        # It's even, so multiply it by 3
        inc     %eax            # and add 1
        ret                     # and return it
odd:
        shl     $2, %eax        # It's odd, so multiply by 4
        sub     $3, %eax        # and subtract 3
        ret                     # and return it
And here’s the same function, written for the SPARC:
        .global f
f:
        andcc   %o0, 1, %g0
        bne     .L1
        sll     %o0, 2, %g2
        sll     %o0, 1, %g2
        add     %g2, %o0, %g2
        b       .L2
        add     %g2, 1, %o0
.L1:
        add     %g2, -3, %o0
.L2:
        retl
        nop
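Both assembly listings decide which branch to take by testing the least significant bit of the argument rather than by computing a remainder. As a rough cross-check, here is a Python sketch of that same logic. The name `f_bit_test` is ours and does not appear in either listing.

```python
def f_bit_test(n: int) -> int:
    """Mirror the assembly: branch on the lowest bit of n."""
    if n & 1:                # lowest bit set, so n is odd
        return (n << 2) - 3  # shift left by 2 (multiply by 4), subtract 3
    return 3 * n + 1         # n is even: multiply by 3, add 1

print([f_bit_test(n) for n in range(6)])  # → [1, 1, 7, 9, 13, 17]
```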
A high-level language gets away from all the constraints of a particular machine. HLLs may have features such as:
Complex expressions (e.g., 2 * (y^5) >= 88 && sqrt(4.8) / 2 % 3 == 9)
The previous example looks like this in Fortran 77 (note how the code begins in column 7 or beyond):
      INTEGER FUNCTION F(N)
      INTEGER N
      IF (MOD(N, 2) .EQ. 0) THEN
          F = 3 * N + 1
      ELSE
          F = 4 * N - 3
      END IF
      RETURN
      END
and like this in Fortran 90 (where the column requirements were finally removed):
integer function f (n)
    implicit none
    integer, intent(in) :: n
    if (mod(n, 2) == 0) then
        f = 3 * n + 1
    else
        f = 4 * n - 3
    end if
end function f
and like this in Ada:
function F (N: Integer) return Integer is
begin
    if N mod 2 = 0 then
        return 3 * N + 1;
    else
        return 4 * N - 3;
    end if;
end F;
and like this in C and C++:
int f(const int n) { return (n % 2 == 0) ? 3 * n + 1 : 4 * n - 3; }
and like this in Java and C#:
class ThingThatHoldsTheFunctionUsedInTheExampleOnThisPage {
    public static int f(int n) {
        return (n % 2 == 0) ? 3 * n + 1 : 4 * n - 3;
    }
}
and like this in Scala:
def f(n: Int) = if (n % 2 == 0) 3 * n + 1 else 4 * n - 3;
and like this in Kotlin:
fun f(n: Int) = if (n % 2 == 0) 3 * n + 1 else 4 * n - 3
and like this in JavaScript:
function f(n) { return (n % 2 === 0) ? 3 * n + 1 : 4 * n - 3; }
and like this in CoffeeScript:
f = (n) -> if n % 2 == 0 then 3 * n + 1 else 4 * n - 3
and like this in Smalltalk:
f ^self % 2 = 0 ifTrue:[3 * self + 1] ifFalse:[4 * self - 3]
and like this in Standard ML:
fun f n = if n mod 2 = 0 then 3 * n + 1 else 4 * n - 3
and like this in Elm:
f n = if modBy 2 n == 0 then 3 * n + 1 else 4 * n - 3
and like this in Haskell (thanks @kaftoot):
f n | even(n)   = 3 * n + 1
    | otherwise = 4 * n - 3
and like this in Julia (yes, 3n is “three times n”):
f(n) = iseven(n) ? 3n+1 : 4n-3
and like this in Lisp:
(defun f (n) (if (= (mod n 2) 0) (+ (* 3 n) 1) (- (* 4 n) 3)))
and like this in Clojure:
(defn f [n] (if (= (mod n 2) 0) (+ (* 3 n) 1) (- (* 4 n) 3)))
and like this in Prolog:
f(N, X) :- 0 is mod(N, 2), X is 3 * N + 1.
f(N, X) :- 1 is mod(N, 2), X is 4 * N - 3.
and like this in Erlang:
f(N) when (N band 1) == 0 -> 3 * N + 1;
f(N) -> 4 * N - 3.
and like this in Perl:
sub f { my $n = shift; $n % 2 == 0 ? 3 * $n + 1 : 4 * $n - 3; }
and like this in Python:
def f(n): return 3 * n + 1 if n % 2 == 0 else 4 * n - 3
and like this in Ruby:
def f(n) n % 2 == 0 ? 3 * n + 1 : 4 * n - 3; end
and like this in Go:
func f(n int) int {
    if n % 2 == 0 {
        return 3 * n + 1
    } else {
        return 4 * n - 3
    }
}
and like this in Rust:
fn f(n: i32) -> i32 { return if n % 2 == 0 {3 * n + 1} else {4 * n - 3} }
and like this in Swift:
func f(n: Int) -> Int { return n % 2 == 0 ? 3 * n + 1 : 4 * n - 3 }
and like this in K:
f:{:[x!2;(4*x)-3;1+3*x]}
System programming languages differ from application programming languages in that they are more concerned with managing a computer system than with solving general problems in health care, game playing, or finance. In a system language, the programmer, not the runtime system, is generally responsible for:
Scripting languages are used for wiring together systems and applications at a very high level. They are almost always extremely expressive (they do a lot with very little code) and usually dynamic (meaning the compiler does very little, while the run-time system does almost everything).
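To make "dynamic" concrete: in a dynamic language, whether an operation is valid is typically decided only when it actually runs. A small Python sketch follows; the function `combine` is hypothetical and exists only for this illustration.

```python
def combine(a, b):
    # No types are declared; "+" is resolved at run time
    # based on whatever objects are actually passed in.
    return a + b

print(combine(1, 2))        # integer addition
print(combine("ab", "cd"))  # string concatenation

try:
    combine(1, "cd")        # only fails when this call executes
except TypeError as error:
    print("run-time error:", error)
```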
An esoteric language is one not intended to be taken seriously. They can be jokes, near-minimalistic, or despotic (purposely obfuscated or non-deterministic).
See Wikipedia’s article on esoteric languages.
John Ousterhout once claimed that programming languages roughly fall into two types, which he called scripting and system languages. You can read about this idea at Wikipedia. Then read this two-part article (Part 1, Part 2) on the dichotomy and on languages that seem to reject it.
Programming languages can be categorized in a number of ways: imperative, applicative, logic-based, problem-oriented, etc. But they all seem to be either an "agglutination of features" or a "crystallization of style." COBOL, PL/1, Ada, etc., belong to the first kind; LISP, APL— and Smalltalk—are the second kind. It is probably not an accident that the agglutinative languages all seem to have been instigated by committees, and the crystallization languages by a single person.
There are only two kinds of languages: the ones people complain about and the ones nobody uses.
Very often a programming language is created to help people program in a certain way. A programming paradigm is a style, or “way,” of programming. Some languages make it easy to write in some paradigms but not others.
Never use the phrase “programming language paradigm.” A paradigm is a way of doing something (like programming), not a concrete thing (like a language). Now, it’s true that if a programming language L happens to make a particular programming paradigm P easy to express, then we often say “L is a P language” (e.g. “Haskell is a functional programming language”) but that does not mean there is any such thing as a “functional language paradigm”.
You should know these common paradigms:
Others include: pure functional, lazy, object-oriented, automata-like, concurrent, concatenative (a.k.a. tacit, or point-free), intentional, literate, reactive, generic, non-deterministic, quantum.
Paradigms are not meant to be mutually exclusive; a single program can feature multiple paradigms!
Make sure to check out Wikipedia’s entry on Programming Paradigms.
How about an overview of some of the major paradigms?
Control flow in imperative programming is explicit: commands show how the computation takes place, step by step. Each step affects the global state of the computation.
    result = []
    i = 0
start:
    numPeople = length(people)
    if i >= numPeople goto finished
    p = people[i]
    nameLength = length(p.name)
    if nameLength <= 5 goto nextOne
    upperName = toUpper(p.name)
    addToList(result, upperName)
nextOne:
    i = i + 1
    goto start
finished:
    return sort(result)
Structured programming is a kind of imperative programming where control flow is defined by nested loops, conditionals, and subroutines, rather than via gotos. Variables are generally local to blocks (have lexical scope).
result = [];
for i = 0; i < length(people); i++ {
    p = people[i];
    if length(p.name) > 5 {
        addToList(result, toUpper(p.name));
    }
}
return sort(result);
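For comparison, the structured version above can be made runnable in Python. This is only a sketch: the `Person` class, the function name, and the sample names are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str

def long_names_upper(people):
    """Return the sorted, uppercased names longer than five characters."""
    result = []
    for p in people:                       # a loop instead of goto
        if len(p.name) > 5:                # a conditional instead of a jump
            result.append(p.name.upper())
    return sorted(result)

people = [Person("Natasha"), Person("Bo"), Person("Alexander"), Person("Maria")]
print(long_names_upper(people))            # → ['ALEXANDER', 'NATASHA']
```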
Early languages emphasizing structured programming: Algol 60, PL/I, Algol 68, Pascal, C, Ada 83, Modula, Modula-2. Structured programming as a discipline is sometimes thought to have been started by a famous letter by Edsger Dijkstra entitled Go to Statement Considered Harmful.
OOP is based on the sending of messages to objects. Objects respond to messages by performing operations, generally called methods. Messages can have arguments. A society of objects, each with its own local memory and its own set of operations, has a different feel than the monolithic-processor, single-shared-memory feel of non-object-oriented languages.
One of the more visible aspects of the more pure-ish OO languages is that conditionals and loops become messages themselves, whose arguments are often blocks of executable code. In a Smalltalk-like syntax:
result := List new.
people each: [:p |
    p name length greaterThan: 5 ifTrue: [result add: (p name upper)]
].
result sort.
^result
This can be shortened to:
^people filter: [:p | p name length greaterThan: 5] map: [:p | p name upper] sort
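The idea that a conditional is just a message taking blocks as arguments can be imitated in Python, though Python does not work this way natively. In this sketch the classes `TrueObject` and `FalseObject`, the method `if_true_if_false`, and the helpers `greater_than` and `describe` are all invented for illustration; lambdas stand in for Smalltalk blocks.

```python
class TrueObject:
    def if_true_if_false(self, then_block, else_block):
        return then_block()   # a true object runs the first block

class FalseObject:
    def if_true_if_false(self, then_block, else_block):
        return else_block()   # a false object runs the second block

def greater_than(a, b):
    """Answer a TrueObject or FalseObject instead of a built-in bool."""
    # The one built-in conditional is hidden here, at the boundary.
    return TrueObject() if a > b else FalseObject()

def describe(name):
    # No if statement here: the choice is made by sending a message.
    return greater_than(len(name), 5).if_true_if_false(
        lambda: name.upper(),
        lambda: name,
    )

print(describe("Alexander"))  # → ALEXANDER
print(describe("Bo"))         # → Bo
```

Which block runs is decided by the receiving object, not by a control structure in the caller.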
Many popular languages that call themselves OO languages (e.g., Java, C++) really just take some elements of OOP and mix them into imperative-looking code. In the following, we can see that length and toUpper are methods rather than top-level functions, but the for and if are back to being control structures:
result = []
for p in people {
    if p.name.length > 5 {
        result.add(p.name.toUpper);
    }
}
return result.sort;
The first object-oriented language was Simula-67; Smalltalk followed soon after as the first “pure” object-oriented language. Many languages designed from the 1980s to the present have labeled themselves object-oriented, notably C++, CLOS (object system of Common Lisp), Eiffel, Modula-3, Ada 95, Java, C#, Ruby.
Control flow in declarative programming is implicit: the programmer states only what the result should look like, not how to obtain it.
select upper(name) from people where length(name) > 5 order by name
No loops, no assignments, etc. Whatever engine interprets this code is just supposed to go get the desired information, and can use whatever approach it wants. (The logic and constraint paradigms are generally declarative as well.)
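One way to try the query above is with Python's built-in `sqlite3` module and an in-memory database. The table layout and the sample rows are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (name text)")
conn.executemany(
    "insert into people (name) values (?)",
    [("Natasha",), ("Bo",), ("Alexander",), ("Maria",)],
)

# We state what we want; the database engine decides how to get it.
rows = conn.execute(
    "select upper(name) from people where length(name) > 5 order by name"
).fetchall()
names = [row[0] for row in rows]
print(names)  # → ['ALEXANDER', 'NATASHA']
conn.close()
```

The Python around the query is ordinary imperative glue; only the SQL string itself is declarative.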
In functional programming, control flow is expressed by combining function calls, rather than by assigning values to variables:
sort(
    fix(f => p =>
        ifThenElse(equals(p, emptylist),
            emptylist,
            ifThenElse(greater(length(name(head(p))), 5),
                append(to_upper(name(head(p))), f(tail(p))),
                f(tail(p)))))(people))
Yikes! We’ll describe that later. For now, be thankful there’s usually syntactic sugar:
let
    fun uppercasedLongNames [] = []
      | uppercasedLongNames (p :: ps) =
          if length(name p) > 5
          then (to_upper(name p))::(uppercasedLongNames ps)
          else (uppercasedLongNames ps)
in
    sort(uppercasedLongNames(people))
end
Huh? That still isn’t very pretty. Why do people like this stuff? Well the real power of this paradigm comes from passing functions to functions (and returning functions from functions).
sort( filter(s => length s > 5, map(p => to_upper(name p), people)))
We can do better by using the cool |> operator. Here x |> f just means f(x). The operator has very low precedence so you can read things left-to-right:
people |> map (p => to_upper (name p)) |> filter (s => length s > 5) |> sort
Let’s keep going! Notice that you wouldn’t write map(x => square(x)), right? You would write map(square). We can do something similar above, but we have to use function composition, you know, (f o g)x is f(g(x)), so:
people |> map (to_upper o name) |> filter (s => length s > 5) |> sort
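Python has neither a `|>` operator nor a built-in composition operator, but both ideas are easy to sketch. The helpers `pipe` and `compose` below are hypothetical, and people are represented as plain dictionaries for brevity.

```python
from functools import reduce

def pipe(value, *functions):
    """Feed value through each function in turn, like x |> f |> g."""
    return reduce(lambda acc, fn: fn(acc), functions, value)

def compose(f, g):
    """Return the function x -> f(g(x)), like (f o g)."""
    return lambda x: f(g(x))

def name(person):
    return person["name"]

people = [{"name": "Natasha"}, {"name": "Bo"}, {"name": "Alexander"}]

result = pipe(
    people,
    lambda ps: map(compose(str.upper, name), ps),
    lambda names: filter(lambda s: len(s) > 5, names),
    sorted,
)
print(result)  # → ['ALEXANDER', 'NATASHA']
```

As in the pipeline above, the stages read top to bottom in the order the data flows through them.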
Here are three things to read to get the gist of functional programming:
With functional programming:
Some people like to say:
Many languages have a neat little thing called comprehensions that combine map and filter.
sorted(p.name.upper() for p in people if len(p.name) > 5)
Logic programming and constraint programming are two paradigms in which programs are built by setting up relations that specify facts and inference rules, and asking whether or not something is true (i.e., specifying a goal). Unification and backtracking to find solutions (i.e., satisfy goals) take place automatically.
Languages that emphasize this paradigm: Prolog, GHC, Parlog, Vulcan, Polka, Mercury, Fnil.
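Python has no built-in unification or backtracking, but generators can give a rough feel for how a logic engine enumerates solutions. In this sketch the `parent` facts and the grandparent rule are made up, and the search is a simple nested iteration rather than real unification.

```python
# Facts: (parent, child) pairs.
parent = [("ann", "bob"), ("bob", "cid"), ("bob", "dee"), ("eve", "ann")]

def children_of(x):
    """Yield every Y such that parent(x, Y) holds."""
    for p, c in parent:
        if p == x:
            yield c

def grandchildren_of(x):
    """Rule: grandparent(X, Z) if parent(X, Y) and parent(Y, Z)."""
    for y in children_of(x):       # try each candidate Y in turn
        yield from children_of(y)  # no matches here means moving on to the next Y

print(list(grandchildren_of("ann")))  # → ['cid', 'dee']
print(list(grandchildren_of("eve")))  # → ['bob']
print(list(grandchildren_of("cid")))  # → []
```

In a real logic language the programmer would write only the facts and the rule; the enumeration strategy spelled out here by hand is supplied by the engine.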
One of the characteristics of a language is its support for particular programming paradigms. For example, Smalltalk has direct support for programming in the object-oriented way, so it might be called an object-oriented language. OCaml, Lisp, Scheme, and JavaScript programs tend to make heavy use of passing functions around so they are called “functional languages” despite having variables and many imperative constructs.
There are two very important observations here:
“OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them.”
We’ve covered: