Software Security
There are many dimensions of security, but one of the biggest is that of developing secure software.
Unit Goals
To get a sense of how we can approach the design and development of secure software.
Secure Software
We can come up with several dimensions of security:
Personal security (people)
Information security (data)
Computer security (systems)
Software security (code)
We have to practice security in hardware, in software, and at the human level. We can employ technical solutions such as passwords, tokens, encryption, access control lists, permission matrices, antivirus software, firewalls, whitelists, blacklists, and security zones. We can train users to be security conscious.
But did you know that malicious actors can often get around many of these technical security “solutions” and do way more damage just by exploiting poorly crafted and buggy code? Because that's where 90% of reported security incidents come from. NINETY. PERCENT.
Secure Software
A lot of people think that security is mostly about defenses such as those mentioned above. But attackers can get around those! Software security considers the code: it is about disciplined software design and development. Build the software right the first time: Prevention is the thing to shoot for; detection and reaction are more expensive.
Exercise: Research the Heartbleed bug. Could it have been stopped by malware detectors, DoS detectors, packet filters, antivirus scanners, smart routers, operating system permissions? No? What was the problem?
The field of secure software development is concerned with the software development lifecycle (requirements definition, analysis, design, development, testing, configuration, deployment, maintenance), and security concerns in each phase, namely:
- How to design secure software: the principles to follow (simplicity, failing fast, setting trust boundaries), how to define security requirements (not always easy), how to do threat modeling and risk analysis, and how to employ techniques like domain-driven design
- The coding constructs supporting secure software, like immutability, encapsulation, error isolation, validation
- Checking code with linters, static analysis tools, and (human) code reviews
- Testing, specifically unit testing, integration testing, penetration testing, fuzz testing
- Operations, such as configuration, deployment, monitoring, disaster recovery
And of course, knowledge of known vulnerabilities and classes of attacks (injection, denial-of-service, out-of-range values, buffer overflows) is important too. We’ll need to learn about:
- Low-level, memory-based attacks, often done by overflowing (overwriting or overreading!) buffers on the stack or heap, or exploiting integer overflow. (Buffers are blocks of memory allocated for accepting input, perhaps from a network request or file read.) These can be done by messing with format or regex patterns, too, or via all kinds of shenanigans with pointer arithmetic. In addition to overflows, low-level attacks can be carried out via dangling pointers or even variable format strings.
- Web security, because the web is everywhere. Even mobile apps make, you guessed it, web service calls.
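To make the overread idea concrete, here is a minimal Python sketch of a Heartbleed-style echo handler. Python is memory-safe, so a bytearray stands in for raw process memory; the names MEMORY, PAYLOAD, unsafe_echo, and safe_echo are illustrative assumptions, not code from any real library.

```python
# Simulated process memory: a 5-byte request payload followed by unrelated secret data.
# (Assumption: a bytearray stands in for raw memory; real Python would not overread.)
PAYLOAD = b"hello"
MEMORY = bytearray(PAYLOAD + b"SECRET_KEY=abc123")


def unsafe_echo(claimed_length: int) -> bytes:
    """Trust the length supplied by the client, as a buggy C handler might."""
    return bytes(MEMORY[:claimed_length])


def safe_echo(claimed_length: int) -> bytes:
    """Check the claimed length against the real payload size before reading."""
    if not 0 <= claimed_length <= len(PAYLOAD):
        raise ValueError("claimed length does not match payload size")
    return bytes(MEMORY[:claimed_length])


if __name__ == "__main__":
    print(unsafe_echo(22))  # includes the bytes stored after the payload
    print(safe_echo(5))     # only the payload itself
```

The fix is a single bounds check in the code itself, which is why no firewall or scanner outside the program could have supplied it.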
The Software Security Mindset
There are three big ideas that comprise the proper mindset for successful secure software development:
Build Security In
Define Security Requirements Properly
Defend Deeply and Broadly
Build Security In
Security concerns must always be on your mind in everything you do. It is not separate from software development. It is an integral part of software development. Why?
- If you’re the developer, you know everything about the system, so you’ll know how to defend.
- Secure software practices catch many vulnerabilities that you might not even know about (and will never encounter, because they won’t ever happen because your code is so good).
- Building secure domain objects and secure modules will generally completely replace tons of ad-hoc defenses. Create the right software. Manipulate the right entities from the right domain with the right constraints. Entities should do only what they are intended to do and no more. They should take on only sensible values. Secure software is better than security software, as they say! Be careful about going down the rabbit hole of throwing in an eclectic set of tools and defense mechanisms for specific threats. Structure your software according to best practices and you might not even need the specific defenses (e.g., XSS sanitizers).
Exercise: Wait, so what exactly is wrong with stuffing the code with tons of ad-hoc defenses?
- If you try to separate security from software development (leaving regular programming to the developers and security to others), you’ll most likely leave security to the end of the job or forget it altogether. Both lead to disaster:
- If you leave it until later, pentesters will find something and you’ll not be able to deploy on time. That is, if they are able to find anything at all, since they can do little more than just throw whatever they can at it (they didn’t write the source code, so they don’t know it as well as you). They’re human too, and they might miss things.
- If you skip security, you will get hacked and destroyed.
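As a rough illustration of a domain object that takes on only sensible values, here is a sketch of an order quantity in Python. The class name Quantity, the limit of 1000, and the helper order_total_cents are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass(frozen=True)
class Quantity:
    """Number of items on an order line: an integer from 1 to MAX (assumed limit)."""

    value: int
    MAX: ClassVar[int] = 1000  # hypothetical business limit, not a dataclass field

    def __post_init__(self) -> None:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(self.value, bool) or not isinstance(self.value, int):
            raise TypeError("quantity must be an integer")
        if not 1 <= self.value <= self.MAX:
            raise ValueError(f"quantity must be between 1 and {self.MAX}")


def order_total_cents(unit_price_cents: int, quantity: Quantity) -> int:
    """Any Quantity that reaches this function has already been validated."""
    return unit_price_cents * quantity.value


if __name__ == "__main__":
    print(order_total_cents(250, Quantity(4)))
    try:
        Quantity(-1)  # a negative quantity would otherwise yield a negative total
    except ValueError as error:
        print("rejected:", error)
```

Because the constraint lives in the type, every function that accepts a Quantity is protected without repeating the check, which is the sense in which a secure domain object can replace scattered ad-hoc defenses.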
Defining Security Requirements
Sometimes requirements definition is hard. It is a learned skill to understand which are the actual requirements and which are just technical details. You can learn it, though. Which of these sounds correct as a requirement?
That was easy, right? Here’s another way to look at this issue: Don’t get too caught up in the fancy crypto and fancy math and digests and hashing at the expense of forgetting some basics (like not authenticating each endpoint in a web service).
Story Time
There was this web app where users had to log in to get to pages in the app, and these pages linked to all sorts of images and other assets, which, you guessed it, just used plain old img elements. So ... (story continues in class)
To check that the requirements have been satisfied, your development process should include simulated attacks (penetration testing), fuzz testing, and maybe even correctness proofs.
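Fuzz testing can start very small. The sketch below throws random strings at a parser and checks one security requirement: the parser either returns a value in range or raises ValueError, and never anything else. The function parse_age, its 0 to 150 range, and the ALPHABET of test characters are hypothetical choices for illustration; real projects typically use a dedicated fuzzing tool.

```python
import random


def parse_age(text: str) -> int:
    """Parse an age in years; raise ValueError for anything outside 0..150."""
    candidate = text.strip()
    if not (candidate.isascii() and candidate.isdigit()) or len(candidate) > 3:
        raise ValueError("age must be 1 to 3 ASCII digits")
    age = int(candidate)
    if age > 150:
        raise ValueError("age out of range")
    return age


# Digits plus characters that often trip up parsers: signs, whitespace,
# a NUL byte, non-ASCII digits, and an emoji.
ALPHABET = "0123456789 -+.eE\t\n\x00\u0663\u00b2\U0001F4A5abc"


def fuzz_parse_age(rounds: int, seed: int = 0) -> int:
    """Feed random strings to parse_age and return how many were accepted.

    Any exception other than ValueError propagates and fails the run.
    """
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    accepted = 0
    for _ in range(rounds):
        text = "".join(rng.choice(ALPHABET) for _ in range(rng.randint(0, 6)))
        try:
            age = parse_age(text)
        except ValueError:
            continue  # rejecting bad input is the expected outcome
        assert 0 <= age <= 150, f"invariant broken for {text!r}"
        accepted += 1
    return accepted


if __name__ == "__main__":
    print("accepted inputs:", fuzz_parse_age(2000))
```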
Defending in Depth and in Breadth
The idea of “defense in depth” is known in circles outside software security. Here’s the idea, from the Viega and McGraw book:
The idea behind defense in depth is to manage risk with diverse defensive strategies, so that if one layer of defense turns out to be inadequate, another layer of defense will hopefully prevent a full breach ... Security cameras alone are a deterrent for some. But if people don't care about the cameras, then a security guard is there to physically defend the bank with a gun. Two security guards provide even more protection. But if both security guards get shot by masked bandits, then at least there's still a wall of bulletproof glass and electronically locked doors to protect the tellers from the robbers. Of course if the robbers happen to kick in the doors, or guess the code for the door, at least they can only get at the teller registers, since we have a vault protecting the really valuable stuff. Hopefully, the vault is protected by several locks, and cannot be opened without two individuals who are rarely at the bank at the same time. And as for the teller registers, they can be protected by having dye-emitting bills stored at the bottom, for distribution during a robbery.
Layers in information security will include firewalls, anti-virus software, crypto, authentication mechanisms, authorization rules, signatures, and maybe even correctness proofs. Some of the defenses you will write yourself; others will come from libraries. If any attacks do get through, you should have intrusion detection and forensic tools to help contain the breach and make repairs.
Exercise: Research canaries and honeypots. How do these help with intrusion detection? What else are they good for?
While depth defense is concerned with putting up a series of defenses that an attacker needs to break through, breadth defense recognizes that an attack can go after many parts of a system. For example, in how many ways can a denial of service, or a restriction in availability, occur? This list can get you started:
- Insufficient network bandwidth
- The hard drives filling up
- Excessive memory paging or cache invalidations
- Hash collisions
- Deadlocks
- Livelocks
- Bad database queries that don’t use indexes
- Slow algorithms (e.g., worst case Quicksort is Θ(n²))
Now, how many of these come from a poor architecture? Poor coding? Attacker knowledge? They might come from all three, so we need to guard against these at multiple levels. Profile the code. Fuzz test!
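To see how attacker knowledge and a slow algorithm combine, the sketch below counts comparisons in a deliberately naive Quicksort. With a first-element pivot, already-sorted input triggers the quadratic worst case; choosing the pivot at random is one common mitigation. The function names and the input size of 200 are assumptions for this example.

```python
import random


def quicksort_comparisons(items, choose_pivot):
    """Sort a copy of items; return (sorted_list, comparison_count).

    One comparison is counted per non-pivot element per partition step.
    """
    comparisons = 0

    def sort(seq):
        nonlocal comparisons
        if len(seq) <= 1:
            return seq
        index = choose_pivot(seq)
        pivot = seq[index]
        rest = seq[:index] + seq[index + 1:]
        comparisons += len(rest)
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x >= pivot]
        return sort(smaller) + [pivot] + sort(larger)

    return sort(list(items)), comparisons


def first_pivot(seq):
    """Predictable pivot: an attacker who knows this can craft worst-case input."""
    return 0


_rng = random.Random(0)  # seeded only to make this demo repeatable


def random_pivot(seq):
    """Unpredictable pivot: sorted input is no longer a special bad case."""
    return _rng.randrange(len(seq))


if __name__ == "__main__":
    hostile_input = range(200)  # already sorted
    _, worst = quicksort_comparisons(hostile_input, first_pivot)
    _, randomized = quicksort_comparisons(hostile_input, random_pivot)
    print(worst)       # 19900, i.e. 200 * 199 / 2
    print(randomized)  # far fewer; the exact count depends on the pivots drawn
```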
Principles
Part of the mindset of doing security involves living by a few basic principles that will become self-evident to you over time. These include:
- Set Trust Boundaries
- Protect resources with a series of gates, e.g., “this resource (database entity, API endpoint, private subsystem, etc.) is gated by X.” Gates include things like “only logged in users,” or “users with a particular permission” or “packets originating from a given IP range.” You can set up zones, too, and distinguish, say, code on the edge (user-facing, untrusted) from code on the inside (data you control and can trust).
- Design for Least Privilege
- Give every user, subsystem, object, function, the absolute least privilege (permission) necessary to do its job. Always make secure the default, and open up little by little explicitly as needed.
- Maintain Integrity
- Restrict the domain of entities, e.g., a plain old int or string is often bad. Have preconditions, postconditions, and invariants enforced in the code. Treat every piece of input as a threat.
- Fail Fast
- Identify problems right away! Don’t let inconsistencies worm their way through the data. If something can’t be done, don’t do it. If you crash in the middle of an operation, clean up!
- Audit
- Log everything that happens, but don’t log any secrets. Also keep the logs secure!
- Don’t Rely on Secrets
- The more secrets you have, the more likely they will leak or be guessed. People might even divulge them (accidentally, maliciously, or because they are physically threatened).
- Keep It Simple
- Complexity introduces more possibility for errors and makes it hard to reason about security. Every added bit of complexity introduces new attack vectors. The more inputs you have, the greater the attack surface. The more complex the inputs you allow (e.g., markup or formatted text or documents), the greater the attack surface. Attackers love going after your inputs.
- Prevent Leaks
- Don’t leak error information. No PHP or SQL dumps on user-visible error pages. Don’t let error information allow attackers to guess. Never say “Incorrect password” since that might imply the attacker guessed a valid user name. Prefer errors like “Not found” for permission errors (since an attacker might be trying to guess user names or resource ids).
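The “Prevent Leaks” principle can be sketched as a login check that answers identically whether the user name or the password was wrong. The in-memory USERS store, the GENERIC_FAILURE message, and the iteration count are assumptions for this example; hashlib.pbkdf2_hmac and hmac.compare_digest are standard-library functions.

```python
import hashlib
import hmac
import os
from typing import Tuple

GENERIC_FAILURE = "Invalid username or password"
ITERATIONS = 100_000  # assumed work factor for this sketch


def _derive(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def make_record(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the password itself is never stored."""
    salt = os.urandom(16)
    return salt, _derive(password, salt)


# Hypothetical user store; a real system would use a database.
USERS = {"alice": make_record("correct horse battery staple")}

# Record used for unknown users so that both failure paths do similar work.
_DUMMY = make_record("placeholder")


def login(username: str, password: str) -> str:
    salt, expected = USERS.get(username, _DUMMY)
    matches = hmac.compare_digest(_derive(password, salt), expected)
    # A match against the dummy record must never count as success.
    if matches and username in USERS:
        return "Welcome"
    return GENERIC_FAILURE


if __name__ == "__main__":
    print(login("alice", "wrong"))    # same message ...
    print(login("mallory", "wrong"))  # ... whether or not the user exists
```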
Exercise: Research the phrase “Security Through Obscurity” and make a list of all the reasons it is a bad thing.
Did you notice
None of these principles mentioned specific attacks like XSS or DDoS or Billion Laughs or SQL Injection.
That’s the point.
Exercise: The list of principles above is not the only one out there. Here are three others to read:
Which, if any, of the principles from these lists are left out of the list above?
Tactics
Getting impatient with all this high-level stuff? Wondering what to do in practice? Hang on, we’ll get there. In the meantime, here’s a list of some lower-level strategies and tactics:
- Validate all the inputs (types, bounds, origins, structure, meaning)
- Sanitize all the inputs (“input,” whether code, pattern, or markup, should never be “executable”)
- Favor immutability!
- If not immutable, make defensive copies when data comes in
- If not immutable, copy when sending data out
- Capture constraints in the domain classes (don’t rely on utility functions to do it)
- Don't duplicate code
- Understand the code you are copy-pasting from StackOverflow
- Maximize cohesiveness and minimize coupling
- Minimize your ifs and loops
- Create each operation to do one thing and do it well
- Make inputs smaller
- Clean up resources (e.g., in a finally clause)
- Clear out memory! (to avoid Heartbleed-style issues, and also because O.S. hibernation may write memory to disk)
- Don’t let resources get exhausted
- Don’t write sensitive information to logs
- Maintain access controls on log files
- Don’t overflow the logs
- Maintain invariants
- Avoid global variables
- Avoid side effects
- Make all readers idempotent
- Reads should just be reads, never write anything as a side effect (e.g., HTTP GET)
- Know what you are doing if you write concurrent code
- Prefer messaging to shared data (because locks are hard and error-prone)
- Explicitly mark references (to the extent your language allows it)
- Understand everything about character sets and character encoding
- Watch out for modular arithmetic wraparound
- Do bounds checking
- Check pointer dereference operations (don’t dereference nulls)
- Don’t double-free pointers
- If your language has unsafe or metaprogramming features (reflection, loaders, serializers), use sparingly if at all
- Treat responses from native code (e.g., Python, Lua, and Java can all wrap C) as external untrusted input
- Don’t rely on case-sensitivity of file names
- Know what is in your config files, and by all means encrypt these
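As one example from the list, here is a sketch of defensive copying for a class that has to stay mutable. The Playlist class and its methods are hypothetical. The copies are shallow, which is enough here because the elements are immutable strings.

```python
class Playlist:
    """Holds track names; outside code never gets a reference to the internal list."""

    def __init__(self, tracks):
        # Copy on the way in: later changes to the caller's list do not affect us.
        self._tracks = list(tracks)

    def tracks(self):
        # Copy on the way out: callers can modify the result without touching our state.
        return list(self._tracks)

    def add(self, track):
        # The only way to change the playlist, so the constraint is checked in one place.
        if not isinstance(track, str) or not track:
            raise ValueError("track must be a non-empty string")
        self._tracks.append(track)


if __name__ == "__main__":
    source = ["intro", "theme"]
    playlist = Playlist(source)
    source.append("injected")       # mutating the original list has no effect
    playlist.tracks().clear()       # mutating the returned copy has no effect
    print(playlist.tracks())        # ['intro', 'theme']
```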
See this amazing OWASP Secure Coding Practices Quick Reference Guide. It has a great checklist.
How about details? Where can we find examples of good (compliant with security guidelines) and bad (non-compliant) code? Next section!
Guidelines and Standards
It’s good to familiarize yourself with publications made by the pros. These can be (1) collections of known vulnerabilities and weaknesses, or (2) guidelines and coding standards that you follow so that the code you write is secure. Some are language-specific and some are pretty general. Here are some good ones:
- CWE: List of common weaknesses, together with some source-code mitigations you can employ
- CVE: Catalog of publicly disclosed security vulnerabilities
- CERT Coding Standards main page
- CERT C Coding Standard from SEI. Contains guidelines for writing secure C. Each guideline has examples of non-compliant and compliant code
- CERT C++ Coding Standard from SEI. Contains guidelines for writing secure C++. Each guideline has examples of non-compliant and compliant code
- CERT Coding Standard for Java
- MISRA guidelines for C and C++. Best practice guidelines, emphasizing those for embedded and safety-related systems
Many organizations provide summaries of many of these sources, which can be nice to browse before diving into the dense publications themselves. For example, Perforce has useful summaries and overviews of
Software security standards in general,
The CWE,
The CVE,
OWASP,
CERT C,
ISO 26262, and
MISRA C and C++.
Learning Software Security
Here are a few places to learn about Software Security as a discipline:
- Presentations:
- Courses:
- Coursera
- LinkedIn Learning
- Linux Foundation
- Troy Hunt
- Books:
- Online Guides:
How about some old school? Read this 1975 paper by Saltzer and Schroeder.
Summary
We’ve covered:
- Why secure software matters
- Building security in
- Defining security requirements
- Defending deeply and broadly
- General principles
- Specific tactics
- Important guidelines and standards to know about
- A few good resources