Secure C
C is notorious for being an unsafe language. But this is by design! Programming in C requires discipline and a lot of detailed knowledge of memory management and other low-level concepts. There is much to learn!
Background
These notes will assume you’re a fairly capable C programmer. In particular, you should already know:
- Why C is unsafe
- The different data types of the language
- How storage is segmented into three regions: Static, Stack, and Heap
- What a pointer is, and the basics of pointer arithmetic
- How to allocate memory in each region
- The difference between arrays and pointers
- That primitives and structs are stored and passed by value
- That arrays, when assigned, passed as arguments, or returned from functions, are treated as pointers
- How strings are stored and manipulated
- The dangers of pointers to stack variables
- The most basic information about memory leaks and dangling pointers
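As a quick self-check on two of the items above (value semantics for structs, and arrays being treated as pointers when passed), here is a minimal sketch. The names `shift_copy`, `size_seen_by_callee`, and `zero_first` are made up for illustration and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>

struct point { int x; int y; };

/* Structs are passed by value: the callee changes only its own copy. */
int shift_copy(struct point p)
{
    p.x += 100;
    return p.x;
}

/* An array argument arrives as a pointer to its first element
   (declaring the parameter as int values[] would mean exactly the same),
   so sizeof here measures a pointer, not the caller's array. */
size_t size_seen_by_callee(int *values)
{
    return sizeof values;
}

/* Because only a pointer was passed, the callee writes to the caller's storage. */
void zero_first(int *values)
{
    assert(values != NULL);
    values[0] = 0;
}
```

Given `int a[10]` in the caller, `sizeof a` is `10 * sizeof(int)`, while `size_seen_by_callee(a)` is `sizeof(int *)`. This is why functions that take a buffer also need its length as a separate argument.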
The Biggest Security Problems in C
Remember that C is by design NOT a memory-safe programming language. It’s not a good choice for high-level applications, but it might be the best you have for system and embedded software.
Memory safety is the biggest problem. The language does not prevent buffer overruns (accesses before the start or past the end of a buffer) or format string attacks. These attacks still happen today.
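Since the language will not stop either problem, the checks have to be written by hand. The following is a minimal sketch, not a complete defense: `copy_checked` and `format_untrusted` are hypothetical helper names, and the only library calls used are the standard `strlen`, `memcpy`, and `snprintf`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst only if it fits, null terminator included.
   Returns 0 on success, -1 if src needs more than dst_size bytes. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t needed = strlen(src) + 1;
    if (needed > dst_size) {
        return -1;              /* refuse instead of overrunning dst */
    }
    memcpy(dst, src, needed);
    return 0;
}

/* Untrusted text is passed as an argument to a constant "%s" format.
   Writing snprintf(out, out_size, untrusted) instead would let the
   input's own % directives be interpreted. */
int format_untrusted(char *out, size_t out_size, const char *untrusted)
{
    assert(out != NULL && untrusted != NULL);
    return snprintf(out, out_size, "%s", untrusted);
}
```

With an 8-byte destination, `copy_checked` accepts `"short"` and rejects `"much too long"`. `format_untrusted` copies an input such as `"%x%x%n"` literally rather than acting on it.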
For an excellent overview of security concerns in C, see weeks 1 and 2 of the Coursera course Software Security with Michael Hicks.
Watch each of the videos for these two weeks (note that week 1 is black hat, week 2 is white hat):
- 1.1 Introduction to Low Level Security
- 1.2 Memory Layout
- 1.3 Buffer Overflow
- 1.4 Code Injection
- 1.5 Other Memory Exploits
- 1.6 Format String Vulnerabilities
- 2.1 Introduction to defenses against low level attacks
- 2.2 Memory Safety
- 2.3 Type Safety
- 2.4 Avoiding Exploitation
- 2.5 Return Oriented Programming (ROP)
- 2.6 Control Flow Integrity
- 2.7 Secure Coding
As you go through these videos, make notes of:
- The memory model
- How buffer overflows work (at a very high level)
- How memory safety (both temporal and spatial) is defined
- How type safety is defined
- The cat and mouse game of defenders trying to prevent memory attacks by building techniques into the hardware, operating system, and run-time environment, such as:
  - Canaries: A canary is a special value written somewhere in a stack frame, so that buffer overflow attacks will almost assuredly overwrite it and hence be detected.
  - W^X: Hardware support for regions of memory that are either writable XOR executable, never both. (See why this works?)
  - ASLR (Address Space Layout Randomization): Randomizing the location of memory blocks so an attacker cannot craft an exploit that assumes certain data lives at a known address.
  - CFI (Control Flow Integrity): An assurance that a program only goes through predetermined points of execution.
- Avoiding problems via secure coding techniques!
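To make the canary idea from the list above concrete, here is a toy simulation. It is only a sketch under loud assumptions: the "stack frame" is an ordinary byte array, the canary is a fixed constant, and every name (`frame_init`, `careless_write`, `canary_intact`) is invented for illustration. Real canaries are inserted by the compiler, placed next to the saved return address, and chosen randomly at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE    8
#define CANARY_SIZE 4
#define FRAME_SIZE  (BUF_SIZE + CANARY_SIZE)

/* Fixed value for the demo only; a real canary is random per run. */
static const unsigned char CANARY[CANARY_SIZE] = {0xDE, 0xAD, 0xC0, 0xDE};

/* Simulated frame layout: BUF_SIZE buffer bytes, then the canary. */
void frame_init(unsigned char *frame)
{
    assert(frame != NULL);
    memset(frame, 0, BUF_SIZE);
    memcpy(frame + BUF_SIZE, CANARY, CANARY_SIZE);
}

/* Deliberately careless: trusts len instead of comparing it to BUF_SIZE.
   The cap at FRAME_SIZE exists only so the simulation itself stays
   within the array and remains well defined. */
void careless_write(unsigned char *frame, const unsigned char *data, size_t len)
{
    assert(frame != NULL && data != NULL);
    if (len > FRAME_SIZE) {
        len = FRAME_SIZE;
    }
    memcpy(frame, data, len);
}

/* Returns 1 if the canary is untouched, 0 if a write ran past the buffer. */
int canary_intact(const unsigned char *frame)
{
    assert(frame != NULL);
    return memcmp(frame + BUF_SIZE, CANARY, CANARY_SIZE) == 0;
}
```

Writing 8 bytes leaves the canary intact. Writing 9 bytes corrupts its first byte, and `canary_intact` reports it. A real runtime aborts the program at that point instead of returning through a possibly overwritten return address.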
If you don’t have to use C, maybe you can just use Rust.
Exercise: Learn about the programming language Rust. Does it try to replace C? Or C++? Is it as efficient? What is its approach to memory safety (in three sentences or less)?
C Secure Programming Resources
Where else can we learn how to be a security-minded developer? Here are some resources:
Let’s say a few words about the latter, the SEI CERT C Coding Standard, in the next section.
CERT-C
There’s a fantastic amount of information in The SEI CERT C Coding Standard from the Software Engineering Institute at Carnegie Mellon.
The Standard is organized into a number of Guidelines, divided into
Rules and Recommendations, grouped into sections numbered and titled as
follows:
- 01 (PRE) Preprocessor
- 02 (DCL) Declarations and Initialization
- 03 (EXP) Expressions
- 04 (INT) Integers
- 05 (FLP) Floating-Point
- 06 (ARR) Arrays
- 07 (STR) Characters and Strings
- 08 (MEM) Memory Management
- 09 (FIO) Input/Output
- 10 (ENV) Environment
- 11 (SIG) Signals
- 12 (ERR) Error Handling
- 13 (API) Application Programming Interfaces
- 14 (CON) Concurrency
- 48 (MSC) Miscellaneous
- 50 (POS) POSIX
- 51 (WIN) Microsoft Windows
Rules are requirements whose violation is likely to produce a defect that can lead to an exploitable
vulnerability. Conformance is generally checkable, at least in principle, by static analysis tools (or by a
competent human code reviewer). Recommendations improve software quality, but violating one
is not necessarily a defect.
Examples of Rules:
- STR31-C. Guarantee that storage for strings has sufficient space for character data and the null terminator
- DCL36-C. Do not declare an identifier with conflicting linkage classifications
- FIO42-C. Close files when they are no longer needed
- MEM31-C. Free dynamically allocated memory when no longer needed
- MSC41-C. Never hard code sensitive information
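Here is a rough idea of what following three of these rules can look like in code. This is an illustrative sketch, not an example taken from the standard itself, and `duplicate_string` and `count_bytes` are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* STR31-C: reserve strlen + 1 bytes so the null terminator fits.
   Returns NULL if allocation fails. The caller owns the result and
   must free() it when it is no longer needed (MEM31-C). */
char *duplicate_string(const char *src)
{
    assert(src != NULL);
    size_t size = strlen(src) + 1;
    char *copy = malloc(size);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, src, size);
    return copy;
}

/* FIO42-C: the file is closed on every path that opened it.
   Returns the number of bytes read, or -1 on any error. */
long count_bytes(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;              /* nothing was opened, nothing to close */
    }
    long n = 0;
    while (fgetc(f) != EOF) {
        n++;
    }
    int read_failed = ferror(f);
    if (fclose(f) != 0 || read_failed) {
        return -1;
    }
    return n;
}
```

A caller pairs each successful `duplicate_string` with exactly one `free`, for example `char *s = duplicate_string("hello"); if (s != NULL) { /* use s */ free(s); }`.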
Examples of Recommendations:
- DCL05-C. Use typedefs of non-pointer types only
- MEM11-C. Do not assume infinite heap space
- EXP12-C. Do not ignore values returned by functions
- FIO21-C. Do not create temporary files in shared directories
- MSC07-C. Detect and remove dead code
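Two of these recommendations, MEM11-C and EXP12-C, come down to checking what library functions tell you. The sketch below shows one way to do that. The helpers `make_int_array` and `parse_int` are made-up names, and the error conventions (NULL and -1) are assumptions of this example, not something the standard prescribes.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* MEM11-C: the heap is finite, so the request is sanity-checked and the
   result of malloc may still be NULL. Callers must test for NULL. */
int *make_int_array(size_t count)
{
    if (count == 0 || count > SIZE_MAX / sizeof(int)) {
        return NULL;            /* empty, or count * sizeof(int) would overflow */
    }
    return malloc(count * sizeof(int));
}

/* EXP12-C: strtol reports trouble through endptr and errno, and this
   function reports trouble through its own return value.
   Returns 0 and stores the value in *out, or returns -1 on bad input. */
int parse_int(const char *text, int *out)
{
    assert(text != NULL && out != NULL);
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (end == text || *end != '\0') {
        return -1;              /* no digits, or trailing junk */
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return -1;              /* does not fit in an int */
    }
    *out = (int)value;
    return 0;
}
```

Callers are expected to test both results, as in `if (parse_int(input, &n) != 0) { /* reject input */ }`. Otherwise the checks inside the helpers accomplish nothing.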
The online standard has a page listing all of the rules and a page listing all of the recommendations. You can begin your study simply by reading the title of each of the 120 rules and 186 recommendations. Then dive into the ones of interest.
Reading the whole standard can be worthwhile too! There’s some nice introductory material, explanations of the standard’s organization and how to get the most from it, some history, and discussion of its relationship with other standards and publications. The Back Matter section has some good stuff summarizing useful bits from the C Language Specification and a listing of some tools commonly used in industrial C programming.
CLASSWORK
We’re going to browse a few of these guidelines!
Summary
We’ve covered:
- How C is insecure by design
- Known problems
- Resources for secure C programming
- CERT C