C is notorious for being an unsafe language. But this is by design! Programming in C requires discipline and a lot of detailed knowledge of memory management and other low-level concepts. There is much to learn!
Background
These notes will assume you’re a fairly capable C programmer. In particular, you should already know:
Why C is unsafe
The different data types of the language
How storage is segmented into three regions: Static, Stack, and Heap
What a pointer is, and the basics of pointer arithmetic
How to allocate memory in each region
The difference between arrays and pointers
That primitives and structs are stored and passed by value
That arrays cannot be assigned or returned directly, and that an array used in most expressions (including being passed as an argument) decays to a pointer to its first element.
How strings are stored and manipulated
The dangers of pointers to stack variables
The most basic information about memory leaks and dangling pointers
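As a quick refresher on two of the items above (pass-by-value for structs, and arrays decaying to pointers), here is a minimal sketch. The names `Point`, `move_right`, `move_right_in_place`, and `sum` are made up for this illustration.

```c
#include <stddef.h>

/* Hypothetical struct used only for this illustration. */
typedef struct {
    int x;
    int y;
} Point;

/* Structs are passed by value: this changes a copy, so the caller's
   Point is left untouched. */
void move_right(Point p) {
    p.x += 1;
}

/* To change the caller's struct, pass a pointer to it instead. */
void move_right_in_place(Point *p) {
    if (p != NULL) {
        p->x += 1;
    }
}

/* An array argument decays to a pointer to its first element, so the
   function cannot recover the array's length with sizeof. The length
   has to be passed separately. */
int sum(const int *a, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

A caller holding `int a[] = {1, 2, 3};` would write `sum(a, sizeof a / sizeof a[0])`. That `sizeof` trick works only where `a` is still a true array, not after it has decayed to a pointer.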
The Biggest Security Problems in C
Remember that C is by design NOT a memory-secure programming language. It’s not a good choice for high-level applications, but it might be the best you have for system and embedded software.
Memory safety is the biggest problem. The language does not prevent buffer overruns (reads or writes before the start or past the end of a buffer) or format string attacks. These bugs are still being exploited today.
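To make these two bug classes concrete, here is a small sketch that names the unsafe call in a comment and shows a bounded alternative next to it. The helpers `copy_name` and `format_message` are hypothetical, written only for this illustration. They are one possible defensive style, not the only one.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dst_size) and always
   leaves dst null-terminated. Returns 0 on success, -1 if src does not fit.
   The unsafe version would be strcpy(dst, src), which keeps writing past
   the end of dst whenever src is too long. */
int copy_name(char *dst, size_t dst_size, const char *src) {
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';
        return -1;                 /* reject rather than truncate silently */
    }
    memcpy(dst, src, len + 1);     /* +1 also copies the null terminator */
    return 0;
}

/* Hypothetical helper: writes untrusted text into out.
   The unsafe version would be snprintf(out, out_size, user_text), which
   lets attacker-controlled data act as the format string ("%x", "%n"). */
int format_message(char *out, size_t out_size, const char *user_text) {
    if (out == NULL || user_text == NULL || out_size == 0) {
        return -1;
    }
    /* A constant format string means user_text is only ever data. */
    int n = snprintf(out, out_size, "%s", user_text);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With a constant format string, input such as `"%x%x%n"` is copied as plain characters instead of being interpreted as conversion directives.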
For an excellent overview of security concerns in C, see Topics 1 and 2 of Michael Hicks’s amazing software security course.
Watch each of the videos for these two weeks (note that Topic 1 is black hat while Topic 2 is white hat):
1.1 Introduction to Low Level Security
1.2 Memory Layout
1.3 Buffer Overflow
1.4 Code Injection
1.5 Other Memory Exploits
1.6 Format String Vulnerabilities
2.1 Introduction to defenses against low level attacks
2.2 Memory Safety
2.3 Type Safety
2.4 Avoiding Exploitation
2.5 Return Oriented Programming (ROP)
2.6 Control Flow Integrity
2.7 Secure Coding
As you go through these videos, make notes of:
The memory model
How buffer overflows work (at a very high level)
How memory safety (both temporal and spatial) is defined
How type safety is defined
The cat and mouse game of defenders trying to prevent memory attacks by building into the hardware, operating system, and run-time environments techniques such as:
Canaries: A canary is a special value written somewhere in a stack frame, so that buffer overflow attacks will almost assuredly overwrite it and hence be detected.
W+X (more commonly written W^X or W⊕X): Hardware support for regions of memory that are either writable XOR executable, never both. (See why this works?)
ASLR (Address Space Layout Randomization): Randomizing the location of memory blocks so an attacker cannot craft an exploit that relies on finding certain data at a known address.
CFI (Control Flow Integrity): An assurance that a program only goes through predetermined points of execution.
Avoiding problems via secure coding techniques!
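C enforces neither spatial nor temporal memory safety, but secure coding can approximate both by hand. The sketch below is one possible discipline, not a standard API: the `IntBuffer` type and its functions are hypothetical names invented here. The pointer travels with its length (spatial), and freeing clears the pointer (a partial temporal safeguard).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bounded buffer: the pointer always travels with its length. */
typedef struct {
    int *data;
    size_t len;
} IntBuffer;

/* Allocates a zeroed buffer. On failure, data is NULL and len is 0. */
IntBuffer int_buffer_new(size_t len) {
    IntBuffer b = { calloc(len, sizeof(int)), 0 };
    if (b.data != NULL) {
        b.len = len;
    }
    return b;
}

/* Spatial safety: refuse any index outside [0, len). */
bool int_buffer_set(IntBuffer *b, size_t i, int value) {
    if (b == NULL || b->data == NULL || i >= b->len) {
        return false;
    }
    b->data[i] = value;
    return true;
}

bool int_buffer_get(const IntBuffer *b, size_t i, int *out) {
    if (b == NULL || out == NULL || b->data == NULL || i >= b->len) {
        return false;
    }
    *out = b->data[i];
    return true;
}

/* Temporal safety (partial): after freeing, clear the pointer and length
   so later accesses through this struct are rejected instead of touching
   freed memory. Copies of the struct made earlier are NOT protected. */
void int_buffer_free(IntBuffer *b) {
    if (b == NULL) {
        return;
    }
    free(b->data);
    b->data = NULL;
    b->len = 0;
}
```

The run-time defenses listed above (canaries, ASLR, and so on) limit the damage when checks like these are missing. They do not replace them.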
If you don’t have to use C, maybe you can just use Rust.
Exercise: Learn about the programming language Rust. Does it try to replace C? Or C++? Is it as efficient? What is its approach to memory safety (in three sentences or less)?
Other languages that do many things C does are Zig and Odin.
C Secure Programming Resources
Where else can we learn how to be a security-minded developer? Here are some resources:
We’ll say a few words about the latter in the next section. But first, a case study of sorts. The curl team writes really safe C. How do they do it? Watch:
CERT-C
There’s a fantastic amount of information at The SEI CERT C Coding Standard from the Software Engineering Institute at Carnegie Mellon.
The Standard is organized into a number of Guidelines, divided into Rules and Recommendations, grouped into sections numbered and titled as follows:
01 (PRE) Preprocessor
02 (DEC) Declarations and Initialization
03 (EXP) Expressions
04 (INT) Integers
05 (FLP) Floating-Point
06 (ARR) Arrays
07 (STR) Characters and Strings
08 (MEM) Memory Management
09 (FIO) Input/Output
10 (ENV) Environment
11 (SIG) Signals
12 (ERR) Error Handling
13 (API) Application Programming Interfaces
14 (CON) Concurrency
48 (MSC) Miscellaneous
50 (POS) POSIX
51 (WIN) Microsoft Windows
Rules vs. Recommendations
Rules are requirements whose violation is likely to result in a defect that may harm the safety, reliability, or security of a system, for example by introducing an exploitable vulnerability. Conformance to a rule can, at least in principle, be checked by static analysis tools (or by a competent human code reviewer). Recommendations improve software quality, but violating one is not necessarily a defect.
Examples of Rules:
STR31-C. Guarantee that storage for strings has sufficient space for character data and the null terminator
DCL36-C. Do not declare an identifier with conflicting linkage classifications
FIO42-C. Close files when they are no longer needed
MEM31-C. Free dynamically allocated memory when no longer needed
MSC41-C. Never hard code sensitive information
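As an illustration of what following STR31-C can look like, here is a minimal sketch of a helper that sizes its buffer for the character data plus the null terminator. The function `join_with_space` is a hypothetical example, not code from the standard. The caller is expected to free the result, in the spirit of MEM31-C.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated "a b" string, or NULL on
   failure. The caller owns the result and must free() it. */
char *join_with_space(const char *a, const char *b) {
    if (a == NULL || b == NULL) {
        return NULL;
    }
    size_t la = strlen(a);
    size_t lb = strlen(b);

    /* Guard the size computation against wrap-around. */
    if (lb > SIZE_MAX - 2 || la > SIZE_MAX - 2 - lb) {
        return NULL;
    }
    /* One extra byte for the space, one for the null terminator. */
    size_t total = la + lb + 2;

    char *out = malloc(total);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, a, la);
    out[la] = ' ';
    memcpy(out + la + 1, b, lb + 1);   /* lb + 1 copies b's terminator too */
    return out;
}
```

Forgetting either of the two extra bytes is the classic off-by-one that STR31-C is meant to prevent.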
Examples of Recommendations:
DCL05-C. Use typedefs of non-pointer types only
MEM11-C. Do not assume infinite heap space
EXP12-C. Do not ignore values returned by functions
FIO21-C. Do not create temporary files in shared directories
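Two of these recommendations, EXP12-C and MEM11-C, are easy to show together. The sketch below uses a hypothetical helper, `make_pair`, that checks every return value it receives and treats allocation failure as a normal outcome rather than an impossibility.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated "key=value" string, or
   NULL on any failure. The caller must free() the result. */
char *make_pair(const char *key, int value) {
    if (key == NULL) {
        return NULL;
    }
    /* EXP12-C: use snprintf's return value. With a NULL buffer and size 0
       it reports how many characters the full output would need. */
    int needed = snprintf(NULL, 0, "%s=%d", key, value);
    if (needed < 0) {
        return NULL;
    }
    size_t size = (size_t)needed + 1;   /* +1 for the null terminator */

    /* MEM11-C: the heap is finite, so malloc may return NULL. */
    char *out = malloc(size);
    if (out == NULL) {
        return NULL;
    }
    int written = snprintf(out, size, "%s=%d", key, value);
    if (written < 0 || (size_t)written >= size) {
        free(out);
        return NULL;
    }
    return out;
}
```

A caller should in turn check the result of `make_pair` before using it. Ignoring it would simply move the EXP12-C violation up one level.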
Reading the whole standard can be worthwhile too! There’s some nice introductory material, explanations of the standard’s organization and how to get the most from it, some history, and discussion of its relationship with other standards and publications. The Back Matter section has some good stuff summarizing useful bits from the C Language Specification and a listing of some tools commonly used in industrial C programming.
CLASSWORK
We’re going to browse a few of these guidelines!
The Safer-C Landscape
Let’s gather up guidelines, recommendations, and alternatives. Here’s a table mostly generated by a friendly AI chat bot (the one from OpenAI):
Here are some questions useful for your spaced repetition learning. Many of the answers are not found on this page. Some will have popped up in lecture. Others will require you to do your own research.
What is perhaps the most unsafe aspect of C?
It lacks memory safety.
Who is the author of the software security course recommended in these notes?
Michael Hicks
What are some black hat topics in low level security?
Buffer overflow, code injection, format string attacks.
What are some white hat topics in low level security?
Memory safety, type safety, return oriented programming (ROP), control flow integrity (CFI), secure coding practices.
What is the difference between spatial and temporal memory safety?
Spatial memory safety means that a program cannot read or write outside the bounds of allocated memory. Temporal memory safety means that a program cannot use memory after it has been freed.
What is an example of a lack of type safety in low level code?
A program that uses a pointer to an integer as a pointer to a character.
What are techniques supplied by compilers and operating systems to compensate for poorly written software with memory vulnerabilities?
Canaries, W+X, ASLR, CFI.
What is a canary?
A special value written somewhere in a stack frame, so that buffer overflow attacks will almost assuredly overwrite it and hence be detected.
What is W+X?
Hardware support for regions of memory that are either writable XOR executable.
Why does W+X defeat the traditional naïve buffer overflow attack?
Because the attacker cannot write executable code to the stack.
What is ASLR?
Address Space Layout Randomization: Randomizing the location of memory blocks so an attacker cannot craft an exploit that relies on finding certain data at a known address.
What is CFI?
Control Flow Integrity: An assurance that a program only goes through predetermined points of execution.
What are some low level languages that are alternatives to C?
Rust, Zig, Odin.
What is the name of the free book by David A. Wheeler?
Secure Programming HOWTO.
Which organization publishes the CERT C Coding Standard?
The Software Engineering Institute at Carnegie Mellon.
What is the difference between rules and recommendations in the CERT-C Coding Standard?
Rules are requirements whose violation is likely to result in a defect that may harm safety, reliability, or security, for example an exploitable vulnerability. Recommendations improve software quality, but violating one is not necessarily a defect.
About how many rules and recommendations are present in CERT-C today?
120 rules and 186 recommendations.
What are the major areas addressed in CERT-C? (Name just a few.)