HyperText Transfer Protocol

What is HTTP?

HTTP == HyperText Transfer Protocol

HTTP is the protocol of the web

HTTP is a plain text protocol (as opposed to a binary protocol): each message carries key-value metadata in the form of headers and, sometimes, a body

Other protocols:

HTTP Requests and Responses

The components of HTTP are Requests and Responses

How the web gets to you:

  1. you type in a URL or click on a link
  2. the browser generates an HTTP request based on your settings
  3. the browser sends the request to the server
  4. the server processes the request
  5. the server generates a response in answer to your request
  6. this response gets sent to your browser
  7. your browser processes and (if applicable) displays this response
  8. you watch dancing hamsters

In practice, it's more complicated than that: for each page the browser will also request CSS, JavaScript, and images, and JavaScript on the page may perform asynchronous requests after page load.

However, each of these is an ordinary HTTP request, handled the same way.
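The round trip in the numbered steps can be sketched in Python with only the standard library. This is an illustration, not how a real browser or production server works: HamsterHandler and fetch_once are hypothetical names, and the server is a throwaway one bound to a free local port.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HamsterHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>dancing hamsters</body></html>"
        self.send_response(200)                        # steps 4-5: process, build response
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                         # step 6: send it back

    def log_message(self, format, *args):
        pass  # suppress the default per-request logging


def fetch_once(path="/"):
    """Start a local server, make one GET request, return (status, type, body)."""
    server = HTTPServer(("127.0.0.1", 0), HamsterHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", path)                      # steps 2-3: build and send request
        response = conn.getresponse()                  # step 7: read the response
        result = (response.status, response.getheader("Content-Type"), response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling fetch_once() returns the status code, the Content-Type header, and the body bytes of a single request/response exchange.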

Anatomy of an HTTP Request

A simple HTTP request:

telnet 127.0.0.1 80
GET / HTTP/1.1

What this does:

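This is the simplest request you can make. Strictly speaking, HTTP/1.1 also requires a Host header, though lenient servers may answer without one. Normally, browsers (also called user agents) send further headers with the request that specify their capabilities and other metadata about what they're requesting.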
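As a rough sketch of what actually goes over the wire, the request text can be assembled by hand. build_request is a hypothetical helper, not part of any library; it always adds the Host header that HTTP/1.1 expects and ends the header section with a blank line.

```python
def build_request(method, path, host, headers=None):
    """Return the raw bytes of a body-less HTTP/1.1 request."""
    lines = ["%s %s HTTP/1.1" % (method, path), "Host: %s" % host]
    for name, value in (headers or {}).items():
        lines.append("%s: %s" % (name, value))
    # Each line ends in CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

For example, build_request("GET", "/", "127.0.0.1", {"Accept": "text/html"}) produces the bytes one would otherwise type into telnet.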

Anatomy of the response

HTTP/1.0 200 OK
Server: PasteWSGIServer/0.5 Python/2.6.5
Date: Sat, 07 Aug 2010 01:59:59 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Thu, 08 Jul 2010 23:07:29 GMT
ETag: 1278630449.7-2870
Content-Range: bytes 0-2869/2870
Content-Length: 2870

<html>
  <head>
    <title>Jeff's cryptomythic web page</title>
...

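Let's break this down:

First comes the status line: the protocol version (HTTP/1.0), the status code (200), and a reason phrase (OK).

Then come the headers: (Name: value) pairs, one per line, with the name and the value separated by a ':':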

Server
name of the web server (PasteWSGIServer/0.5 Python/2.6.5)
Date
when the message was sent
Content-Type
what is being returned (text/html). Browsers will process and display the content based on this field
Last-Modified
self explanatory; the last time the page was modified
ETag
an arbitrary tag supplied by the server used for caching
Content-Length
the length of the response body in (8-bit) bytes

Last-Modified and ETag (amongst other headers) are used by browsers in conditional requests (sent back as If-Modified-Since and If-None-Match) to determine whether the page needs to be reloaded

Different servers will return different headers based on the response

User agents (browsers) render the response based on the headers returned

Then comes the body of the response (HTML)

Note: lines are terminated in HTTP by carriage return + linefeed characters (CRLF). This is how Windows text files are terminated. On Linux, there is only the linefeed character.
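The structure described above (status line, headers, blank line, body) can be taken apart with a few lines of Python. parse_response is a hypothetical, simplified helper: it assumes a complete response in memory and ignores details such as repeated headers or case-insensitive header names.

```python
def parse_response(raw):
    """Split raw response bytes into (version, code, reason, headers, body)."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: protocol version, numeric code, reason phrase.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first ':' only, since values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(code), reason, headers, body
```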

A Lexicon of HTTP

HTTP Verbs

These are things a request can do, also called methods

GET
read a resource
POST
update a resource, plus miscellaneous other actions

↑ these are the only methods you can use in an HTML form

PUT
create a resource
HEAD
fetch only the headers for a resource
DELETE
delete a resource

Some of these (for example, HEAD) are done internally by browsers. They are useful for AJAX requests and for making RESTful web services.

DAV (=Distributed Authoring and Versioning) adds even more verbs to these

Status Codes

The server returns a number indicating the status of the response.

200 series: Ok

300 series: redirection

See redirection

400 series: Client Error

The server has a problem with your request

500 series: Server Error

Something bad happened on the server

These are just a few of the total codes, chosen because they are the most commonly used.
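A small sketch of how a client might classify a status code by its series. SERIES and describe are hypothetical names; the series labels follow the ones used above, and the reason phrases come from Python's standard http.HTTPStatus enumeration.

```python
from http import HTTPStatus

# Series labels as used in this section, keyed by the first digit of the code.
SERIES = {2: "Ok", 3: "Redirection", 4: "Client Error", 5: "Server Error"}


def describe(code):
    """Return (series label, standard reason phrase) for a status code."""
    series = SERIES.get(code // 100, "Other")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # a code the standard library does not list
    return series, phrase
```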

Redirection

The 300 series

The most common redirection codes:

When to use what?

How long to persist permanent redirects?

That's a good question! As long as one needs to, and no longer.

Authorization

Request Headers

Accept
What forms (usually MIME types) the user-agent will accept or prefer

Example: Accept: text/html,application/xhtml+xml

Content-Length
For POST requests -- how much data are you sending?
Referer
Which page did you come from? (e.g. if you click on a link)
User-Agent
how your browser chooses to identify itself

...etc

Response Headers

Content-Disposition
inline or attachment; filename=${name}

URLs

http://k0s.org/mozilla/craft/?show_header=Accept,Host,Unicorn&blink=true&black#end
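The example URL can be split into its components with Python's urllib.parse. dissect is a hypothetical helper written for this illustration; keep_blank_values=True is passed so that a bare key such as black (no '=' and no value) is kept rather than silently dropped.

```python
from urllib.parse import parse_qs, urlsplit


def dissect(url):
    """Break a URL into scheme, host, path, query parameters, and fragment."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.netloc,
        "path": parts.path,
        # Each query key maps to a list, because a key may appear more than once.
        "query": parse_qs(parts.query, keep_blank_values=True),
        "fragment": parts.fragment,
    }
```

Note that the commas in show_header are not interpreted by the parser; splitting Accept,Host,Unicorn into three names is up to the application.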

Example service: webcalc

slashes are used for division

Source: http://k0s.org/hg/webcalc

http://127.0.0.1:5151/8/4

yields

2.0

http://127.0.0.1:5151/80*z*sin%282.*3.14*x%29**y?z=8..10&y=1..4&x=0..0.1..1

yields

x,y,z,result
0.0,1.0,8.0,0.0
0.0,1.0,9.0,0.0
0.0,1.0,10.0,0.0
0.0,2.0,8.0,0.0
0.0,2.0,9.0,0.0
0.0,2.0,10.0,0.0
0.0,3.0,8.0,0.0
0.0,3.0,9.0,0.0
0.0,3.0,10.0,0.0
0.0,4.0,8.0,0.0
0.0,4.0,9.0,0.0
0.0,4.0,10.0,0.0
0.1,1.0,8.0,376.017616457
0.1,1.0,9.0,423.019818514
0.1,1.0,10.0,470.022020571
0.1,2.0,8.0,220.920699822
0.1,2.0,9.0,248.535787299
0.1,2.0,10.0,276.150874777
0.1,3.0,8.0,129.796992145
0.1,3.0,9.0,146.021616163
0.1,3.0,10.0,162.246240182
0.1,4.0,8.0,76.2593056402
0.1,4.0,9.0,85.7917188452
0.1,4.0,10.0,95.3241320503
0.2,1.0,8.0,608.550054724
0.2,1.0,9.0,684.618811565
0.2,1.0,10.0,760.687568405
0.2,2.0,8.0,578.645576726

...
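In the second URL the parentheses are percent-encoded (%28 and %29). The sketch below decodes the path and evaluates the same formula directly in Python for one row of the output. It is not webcalc's code: evaluate is a hypothetical function that hard-codes the expression rather than evaluating arbitrary input.

```python
import math
from urllib.parse import unquote

# The path component of the example URL, without the leading slash.
path = "80*z*sin%282.*3.14*x%29**y"
expression = unquote(path)  # percent-decoding restores '(' and ')'


def evaluate(x, y, z):
    """Compute 80*z*sin(2.*3.14*x)**y; ** binds tighter than *."""
    return 80 * z * math.sin(2. * 3.14 * x) ** y
```

With x=0.1, y=1.0, z=8.0 this agrees with the corresponding row of the table above.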

Warning: this code has not been audited for security!

Cookies

Request Header

Cookie:

Response Header

Set-Cookie: 

What is REST?

REpresentational State Transfer

Because HTTP is stateless (with the exception of cookies), it makes a great vehicle for writing functional services!

Why is it important to use the right code?

REST vs. non-REST: Google translate

How it works:

How a RESTful API would work:

Firstly, it'd be better if it would translate a website for you

A truly RESTful web service would take a POST request with a from and a to field, and the body of the request would be the document to be translated. The service would respond with the translated document.

This way, it could be easily interacted with by a computer!
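A sketch of what such a request might look like from the client side. Everything specific here is an assumption: the endpoint http://translate.example/ does not exist, build_translate_request is a hypothetical helper, and passing from and to as query parameters is just one way to supply the two fields. The request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_translate_request(document, source_lang, target_lang,
                            endpoint="http://translate.example/"):
    """Build (but do not send) a POST whose body is the document to translate."""
    query = urlencode({"from": source_lang, "to": target_lang})
    return Request(
        endpoint + "?" + query,
        data=document.encode("utf-8"),  # the request body is the document itself
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )
```

Against a real service, passing the result to urllib.request.urlopen would send it, and the response body would be the translated document.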

Useful Tools

Using curl

Reading headers with curl:

curl -I http://k0s.org/

See man curl for more details