Previously, we created our very own application layer protocols. The date and capitalization application protocols were rather trivial, and our Tic-Tac-Toe protocol (TTTP) was a bit more interesting. Our chat application also had its very own protocol.
But there’s an existing, well-known protocol called HTTP (read about it at Wikipedia) that we can layer applications on top of!
We’ll study HTTP in more detail later, but for now, let’s use it to make an application where clients simply send a message to the server asking for the current datetime. HTTP is a stateless, request-response protocol. For our app, the client will send an HTTP request like:
GET /date
and the HTTP response will be something like:
2019-03-28T04:49:47.952Z
The first HTTP server we are going to write will run on port 59999 and feature two endpoints:
- `GET /` for returning an HTML “page” for the application
- `GET /date` for returning the server’s current datetime as plain text.

```javascript
const http = require('http');

http.createServer((request, response) => {
  if (request.method === 'GET' && request.url === '/date') {
    response.writeHead(200, { 'Content-Type': 'text/plain' });
    response.end(new Date().toISOString(), 'utf-8');
  } else if (request.method === 'GET' && request.url === '/') {
    response.writeHead(200, { 'Content-Type': 'text/html' });
    response.end('<h1>The Date Client</h1><a href="/date">Get date from server</a>');
  } else {
    response.writeHead(404, { 'Content-Type': 'text/plain' });
    response.end('Sorry, that’s not there');
  }
}).listen(59999);

console.log('Date Server running at port 59999');
```
Discussion:

- Node has a built-in module for HTTP called `http`.
- Responses carry headers such as `Content-type`, which we used here to distinguish our plain text responses from our HTML ones.
- Use `writeHead()` to write headers, and `write()` and/or `end()` to write the response body.

Run the server:

```
$ node datewebserver.js
Date Server running at port 59999
```
There are at least four ways to use this server. For illustration, we’ll run the server on localhost. For classwork, we’ll put the clients and servers on different machines.
First, we can just use `nc` (and if you are taking a networking class, you should do this). With the server running on your local box:

```
$ nc localhost 59999
GET /date

HTTP/1.1 200 OK
Content-Type: text/plain
Date: Thu, 28 Mar 2019 04:49:47 GMT
Connection: close

2019-03-28T04:49:47.952Z
```
You have to hit the Enter key TWICE after entering the GET line. That is mandated by the protocol, as we’ll soon see.
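To see why the second Enter matters, it may help to do what `nc` does from a program. The sketch below uses Node's `net` module to open a plain TCP socket and write the request text by hand. The helper names `buildRawRequest` and `sendRaw` are made up for illustration, and the request is a full HTTP/1.1 one (with a `Host` header) rather than the bare line typed above, which is an assumption made to keep it acceptable to strict servers.

```javascript
const net = require('net');

// Build the raw text of a minimal HTTP/1.1 GET request.
// The two empty strings at the end of the array produce the final
// CR LF CR LF: the blank line that tells the server the headers are done.
function buildRawRequest(path, host) {
  return [
    `GET ${path} HTTP/1.1`,
    `Host: ${host}`,
    'Connection: close',
    '',
    '',
  ].join('\r\n');
}

// Hypothetical helper: send the request over a plain TCP socket and pass
// the complete, unparsed response text to the callback.
function sendRaw(host, port, path, callback) {
  let responseText = '';
  const socket = net.connect(port, host, () => {
    socket.write(buildRawRequest(path, host));
  });
  socket.setEncoding('utf8');
  socket.on('data', (chunk) => { responseText += chunk; });
  socket.on('end', () => callback(null, responseText));
  socket.on('error', (error) => callback(error));
}

// With the date server running locally:
// sendRaw('localhost', 59999, '/date', (error, text) => console.log(error || text));
```

Leaving out the final blank line would leave the server waiting for more headers, which is exactly what happens if you hit Enter only once in `nc`.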
That response was pretty large.
That’s right, an HTTP response contains a lot of metadata, which we’ll cover later. And yes, part of the response metadata is the server date, which might make you wonder why anyone would ever write a date server....
What about the other endpoint?
```
$ nc localhost 59999
GET /

HTTP/1.1 200 OK
Content-Type: text/html
Date: Thu, 28 Mar 2019 05:05:11 GMT
Connection: close

<h1>The Date Client</h1><a href="/date">Get date from server</a>
```

And what about that 404 thing?

```
$ nc localhost 59999
GET /whatever

HTTP/1.1 404 Not Found
Content-Type: text/plain
Date: Thu, 28 Mar 2019 05:06:58 GMT
Connection: close

Sorry, that’s not there
```

By the way, Node’s http module does some pretty good parsing of requests and handles a good deal of the protocol for you. These examples will just give you an idea of what’s in HTTP:

```
$ nc localhost 59999
bzusdfyiwuef
HTTP/1.1 400 Bad Request
```
The second way is to use `curl`. Use the `-i` option to see the whole response:

```
$ curl -i -X GET localhost:59999/date
HTTP/1.1 200 OK
Content-Type: text/plain
Date: Thu, 28 Mar 2019 05:08:42 GMT
Connection: keep-alive
Transfer-Encoding: chunked

2019-03-28T05:08:42.217Z
```

Without `-i` you just get the response body:

```
$ curl -X GET localhost:59999/date
2019-03-28T05:10:15.931Z
```
Okay, yes, the third way is to use a web browser, which speaks HTTP already. With your server running on your local machine, enter `http://localhost:59999` into the browser’s address bar, hit Enter, and see the page. Click on the link to hit the other endpoint.

Also, enter the URI of the date endpoint directly: `http://localhost:59999/date`. That worked great, didn’t it?
The fourth way is to write a client program. Your favorite programming language should have an HTTP library for making requests. When you make requests programmatically, you’ll have to examine the response in some detail and take one of several actions depending on the status code. You’ll also have to check and process all the response headers, and read in the response body. The code can get clunky. Here’s how it looks in JavaScript using the raw `http` library:

```javascript
const http = require('http');

http.get('http://localhost:59999/date', (res) => {
  if (res.statusCode !== 200) {
    console.log(`Received status code ${res.statusCode}`);
    res.resume();
    return;
  }
  let dateString = '';
  res.on('data', (chunk) => { dateString += chunk; });
  res.on('end', () => console.log(dateString));
}).on('error', (e) => {
  console.error(`Request failed: ${e.message}`);
});
```
DON’T PANIC!!! If you `npm install request` then it’s much, much easier. (The `request` package has since been deprecated, but it still illustrates the idea.)

```javascript
const request = require('request');

request('http://localhost:59999/date', (error, response, body) => {
  if (error) {
    console.error(error);
  } else if (response.statusCode !== 200) {
    console.log(`Received status code ${response.statusCode}`);
  } else {
    console.log(body);
  }
});
```
Oh wait, aren’t promises much more awesome than those silly callbacks?
```
$ npm install request request-promise --save
```

```javascript
const rp = require('request-promise');

rp('http://localhost:59999/date').then((body) => {
  console.log(body);
}).catch((error) => {
  console.log(error.message || 'Error');
});
```

Interesting.... `request-promise` treats both non-success HTTP responses and failures to connect as reasons to reject a promise.
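That reject-on-failure behavior is easy to reproduce with nothing but the built-in `http` module. The following is a minimal sketch, not the way any particular library implements it. The name `getText` and the `statusCode` property attached to the error are assumptions made for illustration.

```javascript
const http = require('http');

// Hypothetical helper: resolve with the body text on a 2xx response,
// reject on any other status code or on a network-level failure.
function getText(url) {
  return new Promise((resolve, reject) => {
    http.get(url, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        if (res.statusCode >= 200 && res.statusCode < 300) {
          resolve(body);
        } else {
          const error = new Error(`Received status code ${res.statusCode}`);
          error.statusCode = res.statusCode;
          reject(error);
        }
      });
    }).on('error', reject); // e.g. connection refused
  });
}

// With the date server running locally:
// getText('http://localhost:59999/date')
//   .then((body) => console.log(body))
//   .catch((error) => console.error(error.message));
```

Whether a 404 should count as a rejection is a design choice: the browser's `fetch`, for instance, resolves normally for a 404 and rejects only on network failures.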
Let’s review the different ways we’ve hit the server so far:

- Using `nc` and typing in the HTTP request by hand.
- Using `curl`.
- Using a web browser.
- Programmatically, using an HTTP library.

Now what if we wanted to write our own web client, and not use the web page delivered by that web server? In other words: can we write a web app that uses the date web server just for its data, and not use its HTML?
This is what `fetch` is for. Let’s try:

```html
<html>
  <head>
    <meta charset="utf-8">
    <title>Date Web Client</title>
  </head>
  <body>
    <p>IP Address of Server: <input id="ip" type="text"></p>
    <p><button>Get Current Date</button></p>
    <p id="response"></p>
    <script>
      document.querySelector('button').addEventListener('click', () => {
        const host = document.querySelector('#ip').value;
        const messageArea = document.querySelector('#response');
        fetch(`http://${host}:59999/date`).then((res) => {
          messageArea.textContent = `${res.status} `;
          return res.text();
        }).then((data) => {
          messageArea.textContent += data;
        }).catch((error) => {
          messageArea.textContent = error;
        })
      });
    </script>
  </body>
</html>
```

Now if we load this file into the browser, enter `localhost` into the IP box, and hit the button, we get:

```
Access to fetch at 'http://localhost:59999/date' from origin 'null' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
```
CORS? That’s Cross-Origin Resource Sharing. Basically, web browsers stop your JavaScript from reading responses from origins (scheme + host + port) other than the one that served the page it is running on, unless the other server opts in. How can you get around this? If you really were writing a web service, one that just delivers data and not web pages, then you can return a response header `Access-Control-Allow-Origin`, allowing access to certain web clients or to everyone (with the value `*`).
Classwork: Get into groups of three. One person will run the server. The other two will use the HTML client and try to hit the server. Both should see CORS errors. Then the person writing the server will add `'Access-control-allow-origin': 'http://ip_address_of_one_of_the_clients'` into the argument to `writeHead` for the `/date` endpoint. Now only that one client will be able to hit the server. Next, the server programmer will change that header to `'Access-control-allow-origin': '*'` and the group should verify that both clients can now hit the server.
The other way to “get around” CORS is to just make these kinds of calls from other servers. It’s only the call from the browser that is impacted by the same-origin policy.
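On the server side, the decision about which `Access-Control-Allow-Origin` value to send can be isolated in a small function. This is a sketch under assumptions: the helper name `corsHeaders`, the idea of keeping an allow-list array, and the example origin `http://192.0.2.10:8080` are all invented for illustration.

```javascript
// Hypothetical helper: decide which CORS headers (if any) to send back,
// given the Origin header of the request and a list of allowed origins.
function corsHeaders(requestOrigin, allowedOrigins) {
  if (allowedOrigins.includes('*')) {
    return { 'Access-Control-Allow-Origin': '*' };
  }
  if (requestOrigin && allowedOrigins.includes(requestOrigin)) {
    // Echo the one matching origin, and tell caches that the response
    // depends on the Origin request header.
    return { 'Access-Control-Allow-Origin': requestOrigin, 'Vary': 'Origin' };
  }
  return {}; // No header: the browser will block the cross-origin read.
}

// Possible use inside the /date handler of the date server:
// response.writeHead(200, {
//   'Content-Type': 'text/plain',
//   ...corsHeaders(request.headers.origin, ['http://192.0.2.10:8080']),
// });
```

Note that the header holds a single origin (or `*`), not a list, which is why the function echoes back the one origin that matched.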
Let’s look at HTTP, the protocol, in a little more detail.
HTTP Request format:
Method SP RequestURI SP HTTPVersion CR LF |
ZERO OR MORE REQUEST HEADERS |
CR LF |
OPTIONAL MESSAGE BODY |
Example:
```
PUT https://ehr.example.com/types/Lab.Diagnosis HTTP/1.1
authorization: Bearer hbGciOiJIUz8I1NiJ9JzdWIiOiJh
content-type: application/json
Origin: https://e1ak8fcq37hprs.cloudfront.net
Referer: https://e1ak8fcq37hprs.cloudfront.net/
User-Agent: Mozilla/5.0 AppleWebKit/537.36 Chrome/72.0.3626.96

{"display":"Lab Diagnosis","schema":{"code":"string"}}
```
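To make the request format concrete, here is a rough sketch of how a server might take the raw text apart. The function `parseRequest` is hypothetical and far less careful than a real parser such as the one inside Node's `http` module: it does no validation and keeps only the last of any repeated header.

```javascript
// Parse the raw text of an HTTP/1.x request into its parts.
function parseRequest(raw) {
  // The first blank line (CR LF CR LF) separates the head from the body.
  const headEnd = raw.indexOf('\r\n\r\n');
  const head = headEnd === -1 ? raw : raw.slice(0, headEnd);
  const body = headEnd === -1 ? '' : raw.slice(headEnd + 4);

  // First line: Method SP RequestURI SP HTTPVersion
  const [requestLine, ...headerLines] = head.split('\r\n');
  const [method, uri, version] = requestLine.split(' ');

  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // skip anything that is not "name: value"
    // Header names are case-insensitive, so normalize to lowercase.
    headers[line.slice(0, colon).toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { method, uri, version, headers, body };
}

const parsed = parseRequest(
  'PUT /types/Lab.Diagnosis HTTP/1.1\r\n' +
  'Host: ehr.example.com\r\n' +
  'Content-Type: application/json\r\n' +
  '\r\n' +
  '{"display":"Lab Diagnosis"}'
);
console.log(parsed.method, parsed.uri, parsed.headers['content-type']);
// prints: PUT /types/Lab.Diagnosis application/json
```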
HTTP Response format:
HTTPVersion SP StatusCode SP ReasonPhrase CR LF |
ZERO OR MORE RESPONSE HEADERS |
CR LF |
OPTIONAL MESSAGE BODY |
Example:
```
HTTP/1.1 200 OK
access-control-allow-headers: Content-Type,Authorization
access-control-allow-methods: GET,PUT,POST,DELETE,PATCH,HEAD,OPTIONS
access-control-allow-origin: *
content-encoding: gzip
content-length: 94
content-type: application/json
date: Thu, 28 Mar 2019 16:10:20 GMT
status: 200
via: 1.1 9f6b9465776576cba700d600678836e.cloudfront.net (CloudFront)
x-amz-apigw-id: XQq8-Edb8625FbxA=
x-amzn-remapped-content-length: 92
x-amzn-requestid: fc22c7c7-5173-11e9-b44d-b393bfa59355
x-cache: Miss from cloudfront

{
  "post_type": {
    "display": "Lab Diagnosis",
    "name": "Lab.Diagnosis",
    "schema": { "code":"string" }
  }
}
```
More details at MDN. Also you should check out the RFCs.
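Going the other direction, the response format can be produced with a few lines of string handling. This sketch (with the made-up name `formatResponse`) shows roughly what `writeHead()` and `end()` put on the wire for you. It takes the reason phrase from Node's `http.STATUS_CODES` table and ignores details a real server handles, such as `Content-Length`.

```javascript
const http = require('http');

// Serialize a response into the wire format: status line, headers,
// a blank line, then the optional body.
function formatResponse(statusCode, headers, body = '') {
  const reason = http.STATUS_CODES[statusCode] || '';
  const lines = [`HTTP/1.1 ${statusCode} ${reason}`];
  for (const [name, value] of Object.entries(headers)) {
    lines.push(`${name}: ${value}`);
  }
  return lines.join('\r\n') + '\r\n\r\n' + body;
}

// JSON.stringify makes the CR LF pairs visible in the output.
console.log(JSON.stringify(formatResponse(404, { 'Content-Type': 'text/plain' }, 'Sorry')));
// prints: "HTTP/1.1 404 Not Found\r\nContent-Type: text/plain\r\n\r\nSorry"
```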
HTTP was designed around the idea of an infinite number of types of resources (nouns) that are manipulated with a very small number of methods (verbs). The methods are:
Method | Brief Summary |
---|---|
GET | Requests a representation of the target resource |
HEAD | Identical to GET except that the server must not send a message body in the response |
POST | Requests that the target resource process the representation enclosed in the request according to the resource’s own specific semantics |
PUT | Requests that the state of the target resource be created or replaced with the state defined by the representation enclosed in the request message payload |
DELETE | Requests that the origin server remove the association between the target resource and its current functionality |
PATCH | Requests that a set of changes described in the request entity be applied to the resource identified by the Request-URI. The set of changes is represented in a format called a “patch document” identified by a media type |
CONNECT | Requests that the recipient establish a tunnel to the destination origin server identified by the request-target and, if successful, thereafter restrict its behavior to blind forwarding of packets, in both directions, until the tunnel is closed |
OPTIONS | Requests information about the communication options available for the target resource, at either the origin server or an intervening intermediary |
TRACE | Requests a remote, application-level loop-back of the request message |
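One consequence of the nouns-and-verbs design is that a server can look up the resource first and the method second. The sketch below is one possible arrangement, not a standard pattern from any framework: the `routes` table and `dispatch` function are hypothetical, while `http.METHODS` is the list of method names Node's parser understands.

```javascript
const http = require('http');

// Hypothetical route table: path -> { METHOD: handler }.
const routes = {
  '/': { GET: () => 'home page' },
  '/date': { GET: () => new Date().toISOString() },
};

// Decide the status code for a request and, for a 200, the body.
function dispatch(method, path) {
  if (!http.METHODS.includes(method)) {
    return { status: 501 }; // Not a method this server recognizes at all
  }
  const resource = routes[path];
  if (!resource) {
    return { status: 404 }; // No such resource
  }
  const handler = resource[method];
  if (!handler) {
    // The resource exists but does not support this method.
    return { status: 405, allow: Object.keys(resource).join(', ') };
  }
  return { status: 200, body: handler() };
}

console.log(dispatch('DELETE', '/date')); // { status: 405, allow: 'GET' }
console.log(dispatch('GET', '/nope'));    // { status: 404 }
```

The date server earlier in these notes answers 404 for everything it does not handle. Distinguishing 404 from 405 as done here is a refinement, not a requirement.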
Both requests and responses can be packed with metadata which are called headers. Here are some of the common ones:
Request Headers | A-IM Accept Accept-Charset Accept-Encoding Accept-Language Accept-Datetime Access-Control-Request-Method Access-Control-Request-Headers Authorization Cache-Control Connection Content-Length Content-Type Cookie Date Expect Forwarded From Host If-Match If-Modified-Since If-None-Match If-Range If-Unmodified-Since Max-Forwards Origin Pragma Proxy-Authorization Range Referer TE User-Agent Upgrade Via Warning |
---|---|
Response Headers | Accept-Patch Accept-Ranges Age Allow Alt-Svc Cache-Control Connection Content-Disposition Content-Encoding Content-Language Content-Length Content-Location Content-Range Content-Type Date Delta-Base ETag Expires IM Last-Modified Link Location Pragma Proxy-Authenticate Public-Key-Pins Retry-After Server Set-Cookie Strict-Transport-Security Trailer Transfer-Encoding Tk Upgrade Vary Via Warning WWW-Authenticate |
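Check out the list of registered headers at IANA. Note that you can always add your own headers for your own organization or application. By long-standing convention such headers often start with `x-`, although RFC 6648 deprecated that convention for newly defined headers.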
Information is generated, transmitted, and stored as bit sequences. A media type describes the way in which bits are to be interpreted. Use media types as the values of the `Content-type` and other headers. Examples:

- `text/html`
- `image/png`
- `audio/mp4`
- `video/H264`
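A media type has a type and a subtype separated by a slash, and in a header it is often followed by parameters such as `charset`. As a rough illustration, the hypothetical function `parseContentType` below splits such a header value into its pieces. It is a sketch that does not handle quoted parameter values containing semicolons.

```javascript
// Split a Content-Type value such as "text/html; charset=utf-8" into
// its media type and parameters.
function parseContentType(value) {
  const [mediaType, ...rest] = value.split(';').map((part) => part.trim());
  // Type and subtype are case-insensitive, so normalize to lowercase.
  const [type, subtype] = mediaType.toLowerCase().split('/');
  const parameters = {};
  for (const part of rest) {
    if (!part) continue; // tolerate a trailing semicolon
    const [name, paramValue = ''] = part.split('=');
    // Strip optional surrounding double quotes from the value.
    parameters[name.trim().toLowerCase()] = paramValue.trim().replace(/^"|"$/g, '');
  }
  return { type, subtype, parameters };
}

console.log(parseContentType('text/html; charset=utf-8'));
// { type: 'text', subtype: 'html', parameters: { charset: 'utf-8' } }
```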
Every HTTP response carries a three-digit status code in its first line. Most are listed below. For a nice summary, see Wikipedia; for the official documentation, see RFC 7231, Section 6.
1xx : INFORMATIONAL | |
---|---|
100 | Continue |
101 | Switching Protocols |
102 | Processing |
103 | Early Hints |
2xx : SUCCESS | |
200 | OK |
201 | Created (usually you should set the Location header for this) |
202 | Accepted (used for asynch requests) |
203 | Non-Authoritative Information |
204 | No Content |
205 | Reset Content |
206 | Partial Content |
207 | Multi-Status |
208 | Already Reported |
226 | IM Used |
3xx : REDIRECT | |
300 | Multiple Choices |
301 | Moved Permanently |
302 | Found (SUPERSEDED BY 303 AND 307) |
303 | See Other |
304 | Not Modified |
305 | Use Proxy |
307 | Temporary Redirect |
308 | Permanent Redirect |
4xx : CLIENT ERROR | |
400 | Bad Request |
401 | Unauthorized |
402 | Payment Required |
403 | Forbidden |
404 | Not Found |
405 | Method Not Allowed (service doesn't support the requested method at that URI) |
406 | Not Acceptable (server can't give back a representation in a requested format) |
407 | Proxy Authentication Required |
408 | Request Timeout |
409 | Conflict |
410 | Gone |
411 | Length Required |
412 | Precondition Failed |
413 | Payload Too Large |
414 | URI Too Long |
415 | Unsupported Media Type (server can't process the request body) |
416 | Range Not Satisfiable |
417 | Expectation Failed |
418 | I'm a Teapot |
421 | Misdirected Request |
422 | Unprocessable Entity |
423 | Locked |
424 | Failed Dependency |
425 | Too Early |
426 | Upgrade Required |
428 | Precondition Required |
429 | Too Many Requests |
431 | Request Header Fields Too Large |
451 | Unavailable For Legal Reasons |
5xx : SERVER ERROR | |
500 | Internal Server Error |
501 | Not Implemented |
502 | Bad Gateway |
503 | Service Unavailable |
504 | Gateway Timeout |
505 | HTTP Version Not Supported |
506 | Variant Also Negotiates |
507 | Insufficient Storage |
508 | Loop Detected |
509 | Bandwidth Limit Exceeded (nonstandard; not in the IANA registry) |
510 | Not Extended |
511 | Network Authentication Required |
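Because the first digit gives the class of the response, client code can often branch on the class alone and use the full code only for reporting. Here is a small sketch: `statusClass` and `describeStatus` are made-up names, while `http.STATUS_CODES` is Node's built-in table mapping codes to reason phrases.

```javascript
const http = require('http');

// Map a status code to its class, using the first digit.
function statusClass(code) {
  const classes = {
    1: 'informational',
    2: 'success',
    3: 'redirect',
    4: 'client error',
    5: 'server error',
  };
  return classes[Math.floor(code / 100)] || 'unknown';
}

// Describe a status code using Node's built-in table of reason phrases.
function describeStatus(code) {
  const reason = http.STATUS_CODES[code] || 'Unregistered';
  return `${code} ${reason} (${statusClass(code)})`;
}

console.log(describeStatus(404)); // 404 Not Found (client error)
console.log(describeStatus(201)); // 201 Created (success)
```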
Did you notice that our simple webserver had two endpoints, one returning an HTML page and the other just returning raw data? That’s great. HTTP is designed to accept and return resources, which can be raw data, text, HTML, images, videos, whatever.
Some HTTP-based servers are centered around delivering HTML and related content. These are often called webapps. Some are all about exchanging pure data, usually in JSON (or XML, which is still around). These are often called web services or web APIs. If you follow certain principles in your web API architecture, you might get to call your API a REST API.
In practice, you’d never write an application using the `http` module directly. You would use a framework written by someone else. That someone else probably built their framework using `http` directly so you don’t have to.
In no particular order, here are some frameworks that I happen to know about. They are offered without any endorsement:
Language | Frameworks |
---|---|
JavaScript | Express Koa Hapi Ember Angular React Vue |
Python | Django Pyramid Twisted Tornado Flask Sanic |
Ruby | Rails EventMachine Sinatra Padrino Hanami Cuba Goliath Scorched |
Go | Buffalo Iris Gin Martini Revel Gorilla Echo Web.go Goji Beego |
Java | SpringMVC Play Grails Wicket Vert.X |
Scala | Lift Play Finch Akka Chaos Scalatra BlueEyes |
Rust | Warp Thruster Rustful Rustless Tide Nickel Pencil Rocket Canteen Gotham |
Later in the course we’ll learn how to build some interesting web applications, using one or more of these frameworks. Remember that HTTP is just a request-response protocol; to build a complete web app there’s tons to learn about HTML, CSS, and JavaScript (for the front-end), and data stores and similar things (for the back-end). Much more to come.
HTTP is great for requests and responses. But what about applications like interactive games or chats, where the server has to notify the client at any time (not just as a response)? Here HTTP is a poor fit: the client would have to keep polling, and every exchange pays for a full set of headers.
For games and notifications, WebSockets are almost always better!
Because HTTP is so massively popular, Node.js comes with a built-in module called `http` to help you write HTTP servers! How convenient! Here are some of the module highlights:
Values | METHODS STATUS_CODES globalAgent maxHeaderSize |
---|---|
Functions | createServer() request() get() |
class Agent | |
Properties | freeSockets maxFreeSockets maxSockets requests sockets |
Methods | createConnection() keepSocketAlive() reuseSocket() destroy() getName() |
class Server | |
Events | checkContinue checkExpectation clientError close connect connection request upgrade |
Properties | headersTimeout listening maxHeadersCount timeout keepAliveTimeout |
Methods | close() listen() setTimeout() |
class IncomingMessage | |
Events | aborted close |
Properties | aborted complete headers httpVersion method rawHeaders rawTrailers socket statusCode statusMessage trailers url |
Methods | destroy() setTimeout() |
class ClientRequest | |
Events | abort connect continue information response socket timeout upgrade |
Properties | aborted connection finished maxHeadersCount path socket |
Methods | abort() end() flushHeaders() getHeader() removeHeader() setHeader() setNoDelay() |
class ServerResponse | |
Events | close finish |
Properties | connection finished headersSent sendDate socket statusCode statusMessage |
Methods | addTrailers() end() getHeader() getHeaders() getHeaderNames() hasHeader() removeHeader() setHeader() setTimeout() write() writeContinue() writeHead() writeProcessing() |
We’ve covered: