Values

To a machine, information is just bits. To a human, information wants to have meaning. It makes sense to give a name to meaningful chunks of information. In computing, the name that has emerged for this is value.

What is a Value?

Values are the meaningful units of information.

Some values are atomic (meaning they cannot be decomposed); these include numbers, symbols, and characters. Nonatomic (decomposable) values include tuples, sequences, strings, records, sets, dictionaries, and references.

Let’s take a tour.

Numbers

Numbers describe quantity, order, and measure. The simplest kinds of numbers are one-dimensional (scalar). One-dimensional numbers come in various forms, such as:

What?

If you’re not familiar with these terms, or need a refresher, see these notes.

A numeral is a representation of a number. Many systems of numerals exist in various human cultures. The most widely used numeral systems are positional. Positional numerals can be written in binary, octal, decimal, hexadecimal, and other bases. Some programming languages allow underscores in numerals to make them more readable. Some languages distinguish numbers of different sizes, often with a suffix on the numeral (as in 223u8 or 3i64), and some do not.

Here are some examples taken from various programming languages:

21, 1_000_000, 3838021212885321800888128, 0x17F, 0b101111111, 0o577, 223u8, 32767i16, 0x7FFF_0000u32, 3i32, 55u32, 3i64, 3.14159, 3.14159f32, 88.3f64, 60.2e22, 1.602176634e-19, 3E+11, 9355E-22, 16#5FDE3.A33#e+80, 13#A339#.
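To make the difference between a number and its numerals concrete, here is a small sketch in Python. Python is chosen here only for illustration; the examples above come from several languages, and forms such as 223u8 or 16#5FDE3.A33#e+80 are not valid Python.

```python
# Three numerals from the list above, plus their decimal equivalent.
# All four denote the same number.
decimal = 383
hexadecimal = 0x17F
binary = 0b101111111
octal = 0o577
same_number = decimal == hexadecimal == binary == octal

# Underscores are ignored by the language; they only help the reader.
million = 1_000_000

# Scientific notation produces a floating-point number.
elementary_charge = 1.602176634e-19

# Python integers have arbitrary precision, so this large numeral is exact.
big = 3838021212885321800888128

print(same_number, million == 10**6, big + 1)
```

A language with fixed-size integers would instead need a suffix or a type annotation to say how many bits the last value should occupy.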

Multidimensional numbers, like ratios, complex numbers, quaternions, and octonions, are generally defined with tuples or records, which we’ll cover later.

Symbols

Symbols, also known as atoms, are indivisible things assigned a meaning by their creator. Most programming languages have the atoms true and false (they might be called True and False or T and F) to represent truth and falsity. Other common atoms are null, None, and nil (for the absence of information), and undefined (for the absence of knowledge—unknown, don’t care, or none-of-your-business). In many languages, you can create your own atoms, for example: left, right, up, down, red, green, blue, ready, sent, received.

Some languages require atoms to be prefixed with a colon, e.g., :sent. Sometimes you need an apostrophe as a prefix, e.g., 'sent. There are so many variations.
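Python has no dedicated symbol syntax, so the following sketch approximates user-defined atoms with the standard enum module. The Status class and its members are hypothetical names made up for this illustration.

```python
from enum import Enum

# Hypothetical atoms describing the delivery status of a message.
class Status(Enum):
    READY = "ready"
    SENT = "sent"
    RECEIVED = "received"

status = Status.SENT

# Atoms are compared by identity: either it is this atom or it is not.
is_sent = status is Status.SENT
is_ready = status is Status.READY

# Python's built-in atoms for truth, falsity, and absence of information.
builtin_atoms = (True, False, None)

print(is_sent, is_ready, builtin_atoms)
```

In a language with real symbols, such as one that writes :sent or 'sent, no class definition would be needed.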

Characters

A character is a unit of textual information. Every character has a name, such as PERCENT SIGN, GREEK CAPITAL LETTER LAMDA, or BLACK CHESS KNIGHT.

A grapheme is a minimally distinctive unit of writing in some writing system. It is what a person usually thinks of as a character. However, it may take more than one character to make up a grapheme. For example, the grapheme:

R̊

is made up of two characters (1) LATIN CAPITAL LETTER R and (2) COMBINING RING ABOVE. The grapheme:

நி

is made up of two characters (1) TAMIL LETTER NA and (2) TAMIL VOWEL SIGN I. This grapheme:

🚴🏾

is made up of two characters (1) BICYCLIST and (2) EMOJI MODIFIER FITZPATRICK TYPE-5. This grapheme:

🏄🏻‍♀

is made up of four characters (1) SURFER, (2) EMOJI MODIFIER FITZPATRICK TYPE-1-2, (3) ZERO WIDTH JOINER, (4) FEMALE SIGN. And this grapheme:

🇨🇻

requires two characters: (1) REGIONAL INDICATOR SYMBOL LETTER C and (2) REGIONAL INDICATOR SYMBOL LETTER V. It’s the flag for Cape Verde (CV).
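The breakdown of a grapheme into characters can be checked mechanically. The sketch below uses Python's standard unicodedata module on the four-character surfer grapheme described above. It assumes a Python build whose Unicode database includes these emoji characters, which is the case for current Python 3 releases.

```python
import unicodedata

# SURFER + skin tone modifier + joiner + FEMALE SIGN, written as code points
# so that the source code does not depend on how an editor renders the emoji.
surfer = "\U0001F3C4\U0001F3FB\u200D\u2640"

# Python strings are sequences of code points, so len() counts characters,
# not graphemes.
count = len(surfer)
names = [unicodedata.name(ch) for ch in surfer]

print(count)
for name in names:
    print(name)
```

Counting graphemes rather than characters requires the Unicode text segmentation rules, which Python's standard library does not implement.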

Exercise: What two characters would you use to make the Spanish flag?
Exercise: What is the difference between a character and a grapheme?

When characters are included in a character set, they are assigned a code point. In Unicode, roughly 150,000 characters have been mapped to code points already, and more get added from time to time. Traditionally, code points are written in hex (but they don’t have to be). Here are some examples:

   25 PERCENT SIGN
   2C COMMA
   54 LATIN CAPITAL LETTER T
   5D RIGHT SQUARE BRACKET
   B0 DEGREE SIGN
   C9 LATIN CAPITAL LETTER E WITH ACUTE
  2AD LATIN LETTER BIDENTAL PERCUSSIVE
  39B GREEK CAPITAL LETTER LAMDA
  446 CYRILLIC SMALL LETTER TSE
  543 ARMENIAN CAPITAL LETTER CHEH
  5E6 HEBREW LETTER TSADI
  635 ARABIC LETTER SAD
  71D SYRIAC LETTER YUDH
  784 THAANA LETTER BAA
  94A DEVANAGARI VOWEL SIGN SHORT O
  9D7 BENGALI AU LENGTH MARK
  BEF TAMIL DIGIT NINE
  D93 SINHALA LETTER AIYANNA
  F0A TIBETAN MARK BKA- SHOG YIG MGO
 11C7 HANGUL JONGSEONG NIEUN-SIOS
 1293 ETHIOPIC SYLLABLE NAA
 13CB CHEROKEE LETTER QUV
 2023 TRIANGULAR BULLET
 20A4 LIRA SIGN
 20B4 HRYVNIA SIGN
 2105 CARE OF
 213A ROTATED CAPITAL Q
 21B7 CLOCKWISE TOP SEMICIRCLE ARROW
 2226 NOT PARALLEL TO
 2234 THEREFORE
 2248 ALMOST EQUAL TO
 265E BLACK CHESS KNIGHT
 30FE KATAKANA VOICED ITERATION MARK
 4A9D HAN CHARACTER LEATHER THONG WOUND AROUND THE HANDLE OF A SWORD
 7734 HAN CHARACTER DAZZLED
 99ED HAN CHARACTER TERRIFY, FRIGHTEN, SCARE, SHOCK
 AAB9 TAI VIET VOWEL UEA
1201F CUNEIFORM SIGN AK TIMES SHITA PLUS GISH
1D111 MUSICAL SYMBOL FERMATA BELOW
1D122 MUSICAL SYMBOL F CLEF
1F08E DOMINO TILE VERTICAL-06-01
1F991 SQUID
1F0CE PLAYING CARD KING OF DIAMONDS
1F382 BIRTHDAY CAKE
1F353 STRAWBERRY
1F4A9 PILE OF POO

When representing character values in a programming language, we are sometimes, but not always, able to use graphemes directly, but we can always use code points. To distinguish character values from symbols, apostrophes are generally required:
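As one illustration, here are several ways to write the character DEGREE SIGN (code point B0) in Python. Note that Python is one of the languages with no separate character type: a character is simply a string of length one, and either apostrophes or double quotes may delimit it.

```python
# Six spellings of the same character value.
direct = "°"                      # the grapheme itself
by_hex_escape = "\xB0"            # exactly two hex digits
by_u_escape = "\u00B0"            # exactly four hex digits
by_long_escape = "\U000000B0"     # exactly eight hex digits
by_name = "\N{DEGREE SIGN}"       # the character's Unicode name
by_function = chr(0xB0)           # code point to character

# Putting them in a set collapses equal values; one survivor means all equal.
all_equal = len({direct, by_hex_escape, by_u_escape,
                 by_long_escape, by_name, by_function}) == 1

# ord goes the other way, from character back to code point.
code_point = ord(direct)

print(all_equal, hex(code_point))
```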

You will definitely prefer to use code points for “invisible characters” such as HAIR SPACE, EM SPACE, EN SPACE, FOUR-PER-EM SPACE, THIN SPACE, NO-BREAK SPACE, ZERO WIDTH SPACE, LEFT-TO-RIGHT MARK, RIGHT-TO-LEFT MARK, WORD JOINER, INVISIBLE TIMES, BACKSPACE, HORIZONTAL TABULATION, END OF LINE, FORM FEED, CARRIAGE RETURN, END OF TRANSMISSION BLOCK, ESCAPE, FILE SEPARATOR, GROUP SEPARATOR, etc. Otherwise people looking at your code will be really confused. Many languages provide short escape sequences for some of these characters, the most common being \n for END OF LINE and \t for HORIZONTAL TABULATION.
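A short sketch of why this matters, using Python's escape syntax (other languages differ in the details). The variable names are made up for this example.

```python
# Escapes make invisible characters visible in source code.
tab = "\t"                    # HORIZONTAL TABULATION, code point 9
newline = "\n"                # END OF LINE (line feed), code point A
carriage_return = "\r"        # CARRIAGE RETURN, code point D
escape = "\x1b"               # ESCAPE has no short form in Python, so use hex
hair_space = "\u200a"         # HAIR SPACE
zero_width_space = "\u200b"   # ZERO WIDTH SPACE

# This string looks like "ab" when displayed, but it holds three characters.
tricky = "a" + zero_width_space + "b"

print(len(tricky))
print(repr(tricky))           # repr shows the hidden character as an escape
```

Had the zero width space been pasted directly between the quotes, nothing in the source text would reveal that it is there.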

Here are more extensive notes on characters that even venture into how characters are encoded into bits for storage and transmission.

Tuples

A tuple is a value that is a finite, ordered collection of values:
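For instance, in Python notation (the values and names below are invented for illustration):

```python
# Tuples are finite and ordered, and may mix kinds of values.
point = (3, 5)
reading = ("Seattle", 47.6, True)
empty = ()
single = (8,)        # the trailing comma distinguishes a 1-tuple from (8)

# Order matters: these are two different tuples.
order_matters = (3, 5) != (5, 3)

# Components are retrieved by position.
city, latitude, is_coastal = reading

print(point[1], len(empty), len(single), order_matters, city)
```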

Sequences

A sequence, also called a list, is a possibly infinite ordered collection of values. In practice, we usually think of lists as being collections of elements all “of the same type” but this is not strictly necessary. Conventionally, lists are delimited with square brackets while tuples use parentheses. Here are some lists:
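A sketch in Python, where finite lists use square brackets and a conceptually infinite sequence can be modeled with the standard itertools module, which produces elements only on demand. The variable names are illustrative.

```python
from itertools import count, islice

# Finite lists.
primes = [2, 3, 5, 7, 11]
steps = ["mark", "ready", "set", "go"]
mixed = [1, "two", 3.0]      # allowed, though same-type lists are the norm

# A conceptually infinite sequence: 0, 2, 4, 6, ...
# Nothing is computed until elements are requested.
evens = count(start=0, step=2)
first_five_evens = list(islice(evens, 5))

print(primes[0], steps[-1], len(mixed), first_five_evens)
```

Python's built-in list type is always finite; the infinite case is handled here by an iterator rather than by a list proper.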

Strings

A string is a sequence of characters. Strings are used to represent text. Here are some examples:

In some programming languages, characters and strings-of-one-character are indistinguishable. In others, they are completely distinct things which cannot be mixed up at all. This is interesting. It is one of the reasons why learning programming languages is both 😵‍💫 and 🤗.
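Python falls in the first camp, as this small sketch suggests. The Tamil example reuses the two characters from the grapheme discussion earlier.

```python
greeting = "Hello, world"
tamil = "\u0BA8\u0BBF"       # TAMIL LETTER NA + TAMIL VOWEL SIGN I

# A string behaves as a sequence: it has a length and positions.
length = len(greeting)
first = greeting[0]

# Indexing a string yields another string, of length one.
char_is_string = isinstance(first, str) and len(first) == 1

print(length, first, char_is_string, len(tamil))
```

In a language from the second camp, the result of indexing would have a distinct character type and could not be used where a string is expected.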

Records

Briefly, a record is a tuple whose components are named. Examples:

Sometimes the delimiters are braces instead of parentheses. Sometimes the separator is an equal sign instead of a colon. There are many variations. Record components go by many names, including fields, slots, attributes, or members. There may be other names. The vocabulary can get pretty rich.
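As one concrete variation, Python can express a record with the standard dataclasses module. The class name PlayerSeason is hypothetical; the field names and values follow the player record used in this section.

```python
from dataclasses import dataclass

# A record type with four named fields.
@dataclass(frozen=True)
class PlayerSeason:
    name: str
    team: str
    games: int
    points: int

season = PlayerSeason(name="Jewel Lloyd", team="SEA", games=38, points=939)

# Fields are retrieved by name rather than by position.
points_per_game = season.points / season.games

print(season.team, round(points_per_game, 1))
```

Marking the dataclass as frozen makes the record behave like a value that cannot be modified after creation, which matches the treatment of values in these notes.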

It’s sometimes nice to view records pictorially. Here’s one of the records we saw above:

name    "Jewel Lloyd"
team    "SEA"
games   38
points  939

Interestingly, lists, and even tuples, can be viewed as records whose field names are the positions 0, 1, 2, and so on:

0  "mark"
1  "ready"
2  "set"
3  "go"

In most programming languages, lists and records are quite distinct. In JavaScript, they are blurred together into the concept of an “object”; in Lua, they are blurred together into the concept of a “table.” Regardless of what a programming language does, you should know and understand the language-independent concepts.

Sets

A set is an unordered collection of unique values. Examples:
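For instance, using Python's brace notation for sets (the values are invented for illustration):

```python
colors = {"red", "green", "blue"}

# Writing the members in another order, or repeating one, changes nothing.
same_colors = {"blue", "red", "green", "red"}

are_equal = colors == same_colors
size = len(same_colors)

# Membership is the basic question to ask of a set.
has_red = "red" in colors
has_pink = "pink" in colors

print(are_equal, size, has_red, has_pink)
```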

Exercise: Explain why {1, 2}, {2, 1}, and {1, 2, 1} are all the same set.

Dictionaries

A dictionary, also known as a map, is an unordered collection of key-value pairs, designed for looking up the value associated with a given key, in which all of the keys are distinct. Here are some examples:

In practice, dictionary key types are often limited to symbols, numbers, and strings (we used all three above). Some programming languages allow additional kinds of keys, though usually still from a restricted set. Typically, languages require keys to be “hashable”.
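A sketch of these ideas in Python, whose dict type requires hashable keys. The dictionaries and their contents are invented for illustration.

```python
# Keys may be strings, numbers, or hashable tuples.
capitals = {"WA": "Olympia", "OR": "Salem"}
squares = {1: 1, 2: 4, 3: 9}
grid = {(0, 0): "start", (2, 3): "goal"}

olympia = capitals["WA"]            # look up a value by its key

# Assigning to an existing key replaces its value; keys stay distinct.
capitals["OR"] = "Salem, Oregon"
key_count = len(capitals)

# A list is mutable and therefore unhashable, so it cannot serve as a key.
try:
    bad = {[0, 0]: "start"}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(olympia, key_count, grid[(2, 3)], list_key_allowed)
```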

Dictionaries look like records, but the intent of a record is to represent a thing, while a dictionary represents a collection of things.

Exercise: Make sure you understand the difference between a record and a dictionary.
Exercise: Research the following variations of dictionaries: the ordered dictionary and the multidictionary.

References

Here are two records:

name       "Ani"
birthdate  "2019-03-03"
city       "Seattle"
pet        name            "Sparky"
           breed           "G-SHEP"
           weightInKg      33.5
           primaryColor    "brown"
           secondaryColor  "black"

name       "Aaron"
birthdate  "2020-09-05"
city       "Seattle"
pet        name            "Sparky"
           breed           "G-SHEP"
           weightInKg      33.5
           primaryColor    "brown"
           secondaryColor  "black"

Nice, but wait—do these two kids have the same pet, or two different pets that coincidentally happen to have the same name, breed, weight, and colors? The picture suggests two distinct pets. If the kids share the same pet, we’d want this picture:

name       "Ani"
birthdate  "2019-03-03"
city       "Seattle"
pet        ───► (the pet record below)

name       "Aaron"
birthdate  "2020-09-05"
city       "Seattle"
pet        ───► (the pet record below)

name            "Sparky"
breed           "G-SHEP"
weightInKg      33.5
primaryColor    "brown"
secondaryColor  "black"

This picture illustrates a new kind of value, called a reference. A reference value is pictured as an arrow that refers to another value, called its referent. Here is a reference referring to the string value "Hi!":

───► "Hi!"

In some, but by no means all, languages, you make a reference to a value with &, for instance &"Hi!". Given a reference r, you get its referent with *r. These notations are common (C, C++, Rust, and Go all use them in some form), but as always, beware: many syntactic variations exist.

It gets worse, though. Some programming languages make it hard to tell whether you even have a reference or not! Some languages implicitly create references for you and implicitly dereference them (that is, get the referent). It can get really confusing. That’s why you should really learn this stuff at a deep level. It is imperative that you understand how each language you program in handles references. We’ll have much more to say about these things later.
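Python is one of the languages that creates and follows references implicitly, so it makes a reasonable place to see sharing in action. The sketch below models the shared-pet situation with dictionaries; only some of the fields are included, and the variable names are illustrative.

```python
# One pet record, referred to from both kids' records.
sparky = {"name": "Sparky", "breed": "G-SHEP", "weightInKg": 33.5}
ani = {"name": "Ani", "city": "Seattle", "pet": sparky}
aaron = {"name": "Aaron", "city": "Seattle", "pet": sparky}

# `is` asks whether two references have the same referent.
shared = ani["pet"] is aaron["pet"]

# A change made through one reference is visible through the other.
ani["pet"]["weightInKg"] = 34.0
aaron_sees = aaron["pet"]["weightInKg"]

# A separate record with equal contents is equal but not the same referent.
lookalike = dict(sparky)
equal_but_distinct = lookalike == sparky and lookalike is not sparky

print(shared, aaron_sees, equal_but_distinct)
```

No & or * appears anywhere: the references are there, but the language hides them, which is exactly why they are easy to overlook.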

Something wicked

Here is something just awful:

It is a reference with no referent. It is called the null reference, known also as The Billion Dollar Mistake, and The Worst Mistake of Computer Science. Avoid this demon 👹 at all costs. It is beyond disgusting. It has caused great pain 😖 and economic loss 💸. It should never, ever, have been allowed to exist 🤮😢.
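Python does not have a null reference in the strict sense, but its None value plays a similar role in practice and fails in a similar way. The following sketch, with a hypothetical find_pet_name function, shows both the careful handling and the classic failure.

```python
from typing import Optional

# Hypothetical lookup that may find nothing.
def find_pet_name(owner: dict) -> Optional[str]:
    pet = owner.get("pet")       # None when the owner has no pet
    if pet is None:
        return None
    return pet["name"]

with_pet = find_pet_name({"name": "Ani", "pet": {"name": "Sparky"}})
without_pet = find_pet_name({"name": "Bo"})

# Forgetting the None check is the classic failure: the program
# only discovers the missing referent when it runs.
try:
    {"name": "Bo"}.get("pet")["name"]
    failed = False
except TypeError:
    failed = True

print(with_pet, without_pet, failed)
```

Languages that avoid the problem make the possibility of absence part of the type, so the check cannot be forgotten.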

Exercise: Read the article The Worst Mistake In Computer Science.

A billion dollars was the estimate its inventor, Tony Hoare, gave in 2009.

Classifying Values

Did you notice that when looking at values just now, we couldn’t help but classify them into numbers, characters, symbols, sequences, records, etc? The classification was very informal, but...it was hard to miss. There is, though, a very rigorous notion behind this classification. Every value, it turns out, has a type.

Types are one of the most important concepts in Programming Language Theory. They feel obvious and simple, but the theory of types is so vast, so deep, and so fundamental to computing and programming languages that we’ll cover it separately later.
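As a small preview, many languages let you ask a value for its type. Here is a sketch in Python, whose type names are specific to that language and do not line up exactly with the informal categories used above.

```python
values = [21, 3.14159, True, None, "Hi!", (3, 5), [2, 3, 5], {1, 2}, {"WA": "Olympia"}]

# type() reports the type of any value; __name__ gives a readable label.
type_names = [type(v).__name__ for v in values]

print(type_names)
```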

Recall Practice

Here are some questions useful for your spaced repetition learning. Many of the answers are not found on this page. Some will have popped up in lecture. Others will require you to do your own research.

  1. What are some kinds of values?
    Numbers, symbols, characters, tuples, sequences, strings, records, sets, dictionaries, references.
  2. What is the difference between a number and a numeral?
    A number is a value; a numeral is a representation of a number.
  3. Symbols are also known as ________________.
    atoms.
  4. A character is a unit of ________________, while a grapheme is a unit of ________________.
    textual information, writing.
  5. Character sets assign a unique ________________ to each character.
    code point.
  6. What are various ways to write the character LATIN SMALL LETTER C? (Hint: its code point is 99, or 0x63.)
    'c', '\x63', '\u0063', '\U00000063', '\u{63}'
  7. Must tuples be finite? Are they ordered?
    Yes, yes.
  8. Must lists be finite? Are they ordered?
    No, yes.
  9. A string is a sequence of ________________.
    characters.
  10. Records are tuples whose components are ________________.
    named.
  11. What are various terms used to describe the components of a record?
    Fields, attributes, properties, members.
  12. What is a set?
    An unordered collection of unique values.
  13. What is a dictionary?
    A collection of key-value pairs in which all of the keys are distinct.
  14. How are records different from dictionaries?
    Records are used to describe a particular thing by its properties; dictionaries generally describe a single property of many things.
  15. What do references help us describe?
    The sharing of values.
  16. How does one conventionally write a reference to the value 5?
    &5
  17. Some languages unwisely allow references without a referent. What is the technical name given to such a reference, and what pejorative phrase does its creator describe it with?
    A null reference, the billion-dollar mistake.

Summary

We’ve covered:

  • What exactly a value is
  • Atomic values: numbers, symbols, characters
  • Compound values: tuples, sequences, strings, records, sets, dictionaries, references
  • Classifying values