1 of 43

ECS 150: HTTP, URLs, REST, and JSON

Sam King

2 of 43

Administrative

Project 4 out on Wednesday and due on June 2nd

  • Reminder: no late days for this assignment even if you still have some left

Last time: Networking APIs

This time: HTTP and URLs

Next time: REST and JSON

3 of 43

Goal for today

Teach you the most common application-layer protocol: HTTP

Help you understand the principles behind it in case you ever need to design or implement a different protocol

4 of 43

Problem: We didn’t know when to stop reading!

Why doesn’t this work?

We read in bytes and hoped for the best

5 of 43

Hypertext transfer protocol (HTTP)

Application-layer protocol for internet distributed systems

HTTP Goals:

  • Generic
  • Extensible
  • Stateless
  • Human readable

6 of 43

You have a client, or user agent

7 of 43

The client sends requests to a server

HTTP request

8 of 43

The server responds

HTTP response

9 of 43

Note: this is a bit different than TCP

TCP has the notion of a connection

You can send multiple request/response messages in a single TCP connection, and you can even pipeline messages

You don’t even need to use TCP

10 of 43

HTTP Messages

Request-Line | Status-Line

*(message-header CRLF)

CRLF

[ message-body ]
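
As a rough illustration of this grammar, the sketch below assembles a request as one start line, some header lines, and a blank line, each terminated by CRLF. The function name `build_get_request` is hypothetical, and the header values are borrowed from the GunrockClient request used in these slides.

```cpp
#include <string>

// Hypothetical helper (not part of any course library): builds a GET
// request that follows the message grammar, with no message body.
std::string build_get_request(const std::string& path, const std::string& host) {
    const std::string crlf = "\r\n";
    std::string request;
    request += "GET " + path + " HTTP/1.1" + crlf;      // Request-Line
    request += "Host: " + host + crlf;                  // message-header
    request += "User-Agent: GunrockClient/1.0" + crlf;  // message-header
    request += "Accept: */*" + crlf;                    // message-header
    request += crlf;                                    // blank line ends the headers
    return request;
}
```

Calling `build_get_request("/hello_world.html", "localhost:8080")` should produce the bytes a client would write to the socket for that request.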

11 of 43

From last time, HTTP request

GET /hello_world.html HTTP/1.1<CRLF>

Host: localhost:8080<CRLF>

User-Agent: GunrockClient/1.0<CRLF>

Accept: */*<CRLF>

<CRLF>

12 of 43

Request line includes a method, URI, and HTTP version

Request-Line | Status-Line

*(message-header CRLF)

CRLF

[ message-body ]

GET /hello_world.html HTTP/1.1
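
One possible way to pull the three fields out of a request line is sketched here. `RequestLine` and `parse_request_line` are hypothetical names, and the sketch assumes the fields are separated by whitespace and that the trailing CRLF has already been removed.

```cpp
#include <sstream>
#include <string>

// Hypothetical container for the three request-line fields.
struct RequestLine {
    std::string method;
    std::string uri;
    std::string version;
};

// Returns false unless the line holds exactly three whitespace-separated tokens.
bool parse_request_line(const std::string& line, RequestLine& out) {
    std::istringstream in(line);
    if (!(in >> out.method >> out.uri >> out.version)) {
        return false;  // fewer than three tokens
    }
    std::string extra;
    return !(in >> extra);  // reject anything after the version
}
```

For the line above, this should yield the method `GET`, the URI `/hello_world.html`, and the version `HTTP/1.1`.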

13 of 43

Message headers are key/value pairs

Request-Line | Status-Line

*(message-header CRLF)

CRLF

[ message-body ]

GET /hello_world.html HTTP/1.1

Host: localhost:8080

User-Agent: GunrockClient/1.0

Accept: */*
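
A header line can be split into its key and value at the first colon. The sketch below is a simplified assumption of how a parser might do this; `parse_header_line` is a hypothetical name, and the line is assumed to have its CRLF already stripped.

```cpp
#include <string>
#include <utility>

// Splits "Key: value" at the first ':' and trims spaces and tabs around the value.
// Returns false if there is no colon or the key is empty.
bool parse_header_line(const std::string& line,
                       std::pair<std::string, std::string>& out) {
    std::size_t colon = line.find(':');
    if (colon == std::string::npos || colon == 0) {
        return false;
    }
    std::string key = line.substr(0, colon);
    std::string value = line.substr(colon + 1);
    std::size_t first = value.find_first_not_of(" \t");
    if (first == std::string::npos) {
        value.clear();  // value was empty or all whitespace
    } else {
        std::size_t last = value.find_last_not_of(" \t");
        value = value.substr(first, last - first + 1);
    }
    out = std::make_pair(key, value);
    return true;
}
```

Splitting at the first colon matters for a header such as `Host: localhost:8080`, whose value contains a colon of its own.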

14 of 43

The client tells the server that it’s done with the headers by sending a blank line

Request-Line | Status-Line

*(message-header CRLF)

CRLF

[ message-body ]

GET /hello_world.html HTTP/1.1

Host: localhost:8080

User-Agent: GunrockClient/1.0

Accept: */*
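
On the receiving side, the blank line shows up as two CRLFs in a row. A minimal sketch of how a server might detect it in the bytes read so far (`header_end` is a hypothetical name):

```cpp
#include <string>

// Returns the index of the first byte after the blank line that ends the
// headers, or std::string::npos if the blank line has not arrived yet.
std::size_t header_end(const std::string& buffer) {
    const std::string terminator = "\r\n\r\n";
    std::size_t pos = buffer.find(terminator);
    if (pos == std::string::npos) {
        return std::string::npos;  // keep reading from the socket
    }
    return pos + terminator.size();
}
```

A server can keep appending received bytes to `buffer` and stop reading headers once this returns something other than `npos`, instead of reading and hoping for the best.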

15 of 43

This particular request didn’t include the optional message body, but this is where it would go

Request-Line | Status-Line

*(message-header CRLF)

CRLF

[ message-body ]

GET /hello_world.html HTTP/1.1

Host: localhost:8080

User-Agent: GunrockClient/1.0

Accept: */*

16 of 43

From last time, HTTP response

HTTP/1.1 200 OK

Content-Length: 141

Content-Type: text/html

Server: Gunrock Web

<!DOCTYPE html>

<html>

<head>

<title>Hello World</title>

</head>

<body>

<p>Hello ECS 150</p>

</body>

</html>

17 of 43

Status line includes the HTTP version, a status code, and a reason phrase

Request-Line | Status-Line

*(message-header CRLF)

CRLF

[ message-body ]

HTTP/1.1 200 OK
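
A status line could be parsed along these lines. `StatusLine` and `parse_status_line` are hypothetical names; the sketch assumes the CRLF is already removed and allows reason phrases that contain spaces.

```cpp
#include <sstream>
#include <string>

// Hypothetical container for the three status-line fields.
struct StatusLine {
    std::string version;
    int code = 0;
    std::string reason;
};

// Returns false if the version or the numeric status code is missing.
bool parse_status_line(const std::string& line, StatusLine& out) {
    std::istringstream in(line);
    out.reason.clear();
    if (!(in >> out.version >> out.code)) {
        return false;
    }
    std::getline(in, out.reason);  // rest of the line, may contain spaces
    if (!out.reason.empty() && out.reason[0] == ' ') {
        out.reason.erase(0, 1);    // drop the separator before the phrase
    }
    return true;
}
```

For `HTTP/1.1 200 OK` this should give version `HTTP/1.1`, code `200`, and reason `OK`.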

18 of 43

HTTP response also includes headers and lets the client know the size of the message body

Request-Line | Status-Line

*(message-header CRLF)

CRLF

[ message-body ]

HTTP/1.1 200 OK

Content-Length: 141

Content-Type: text/html

Server: Gunrock Web
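
Content-Length is what tells the reader when to stop. The sketch below checks whether a buffer already holds a whole message; `message_complete` is a hypothetical name, the header lookup is case-sensitive for brevity, and a malformed length would make `std::stoul` throw, which a real server would need to handle.

```cpp
#include <string>

// True once the buffer holds the full headers plus Content-Length body bytes.
// Simplifying assumption: a message without Content-Length has no body.
bool message_complete(const std::string& buffer) {
    const std::string terminator = "\r\n\r\n";
    std::size_t end = buffer.find(terminator);
    if (end == std::string::npos) {
        return false;  // headers are still arriving
    }
    std::size_t body_start = end + terminator.size();

    const std::string name = "Content-Length:";
    std::size_t expected = 0;
    std::size_t field = buffer.find(name);
    if (field != std::string::npos && field < end) {
        // stoul skips leading spaces and stops at the first non-digit ('\r').
        expected = std::stoul(buffer.substr(field + name.size()));
    }
    return buffer.size() - body_start >= expected;
}
```

With the response above, the client would keep reading until 141 body bytes have arrived after the blank line.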

19 of 43

The body includes the content of the file that we requested

Request-Line | Status-Line

*(message-header CRLF)

CRLF

[ message-body ]

<!DOCTYPE html>

<html>

<head>

<title>Hello World</title>

</head>

<body>

<p>Hello ECS 150</p>

</body>

</html>

20 of 43

Taking a step back, what are some other protocols that we could have used?

21 of 43

If there are so many ways we could have done it, why did HTTP become ubiquitous?

22 of 43

Other examples

Thrift and Protobuf: binary formats for server-to-server RPC

23 of 43

Standard internet app architecture

The Internet

Frontend / load balancer / TLS termination

HTTP + JSON body

Business logic

...

Core services

Thrift / TCP

Thrift / TCP

24 of 43

Lecture ended here

25 of 43

Administrative

Project 4 out by Monday, due June 7th @ 8am

  • Still no late days available for this project

Last time: HTTP

This time: URLs, REST, and JSON

Next time: JSON and APIs

26 of 43

Uniform resource locators (URLs), full URL

https://bob.cs.ucdavis.edu:81/a/b/page?first=sam&last=king

27 of 43

Uniform resource locators (URLs), host name

https://bob.cs.ucdavis.edu:81/a/b/page?first=sam&last=king

Name of the machine running the HTTP server

28 of 43

Uniform resource locators (URLs), port

https://bob.cs.ucdavis.edu:81/a/b/page?first=sam&last=king

The port that the HTTP server is listening on

  • 80 is default for HTTP
  • 443 is default for HTTPS
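
To make the host and port pieces concrete, here is a minimal sketch that extracts them from a URL and falls back to the default port when none is given. `Endpoint` and `parse_endpoint` are hypothetical names; the sketch assumes the form scheme://host[:port][/path] and ignores cases such as IPv6 literals or user info.

```cpp
#include <string>

// Hypothetical result type: where the client should connect.
struct Endpoint {
    std::string host;
    int port;
};

// Returns false if the URL has no "://" separator.
bool parse_endpoint(const std::string& url, Endpoint& out) {
    std::size_t scheme_end = url.find("://");
    if (scheme_end == std::string::npos) {
        return false;
    }
    std::string scheme = url.substr(0, scheme_end);
    std::size_t start = scheme_end + 3;
    std::size_t stop = url.find_first_of("/?#", start);
    std::string authority = url.substr(
        start, stop == std::string::npos ? std::string::npos : stop - start);

    std::size_t colon = authority.find(':');
    if (colon == std::string::npos) {
        out.host = authority;
        out.port = (scheme == "https") ? 443 : 80;  // defaults from the slide
    } else {
        out.host = authority.substr(0, colon);
        out.port = std::stoi(authority.substr(colon + 1));  // throws if not a number
    }
    return true;
}
```

For the URL above this should give host `bob.cs.ucdavis.edu` and port `81`; dropping `:81` from it would give port `443`.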

29 of 43

Uniform resource locators (URLs), path

https://bob.cs.ucdavis.edu:81/a/b/page?first=sam&last=king

  • Hierarchical portion (path + page)
  • Originally used to specify the file system path and static HTML page location
  • Now used as a namespace for the server

30 of 43

Uniform resource locators (URLs), query

https://bob.cs.ucdavis.edu:81/a/b/page?first=sam&last=king

Key/value pairs passed to the server handler for this particular resource

  • “?” starts the query string
  • “&” separates key/value pairs
  • “=” separates key from value
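
The three separators above translate fairly directly into code. This is a minimal sketch with a hypothetical `parse_query` function; it does not decode %xx escapes or strip a trailing fragment, and later duplicate keys overwrite earlier ones.

```cpp
#include <map>
#include <string>

// Extracts the key/value pairs that follow '?' in a URL.
std::map<std::string, std::string> parse_query(const std::string& url) {
    std::map<std::string, std::string> params;
    std::size_t q = url.find('?');          // '?' starts the query string
    if (q == std::string::npos) {
        return params;
    }
    std::size_t pos = q + 1;
    while (true) {
        std::size_t amp = url.find('&', pos);   // '&' separates pairs
        std::string pair = url.substr(
            pos, amp == std::string::npos ? std::string::npos : amp - pos);
        if (!pair.empty()) {
            std::size_t eq = pair.find('=');    // '=' separates key from value
            if (eq == std::string::npos) {
                params[pair] = "";
            } else {
                params[pair.substr(0, eq)] = pair.substr(eq + 1);
            }
        }
        if (amp == std::string::npos) {
            break;
        }
        pos = amp + 1;
    }
    return params;
}
```

For the URL above, the result should map `first` to `sam` and `last` to `king`.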

31 of 43

URL encoding

What if we want to pass the server a “&”?

32 of 43

URL encoding

What if we want to pass the server a “&”?

Valid chars: A-Z, a-z, 0-9, and - _ . ~; everything else must be written as %xx, where xx is the hex code for the char (e.g., “&” becomes %26)
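
A minimal sketch of this encoding rule, assuming the default "C" locale so that `std::isalnum` only accepts ASCII letters and digits. `url_encode` is a hypothetical name.

```cpp
#include <cctype>
#include <string>

// Percent-encodes every byte outside A-Z, a-z, 0-9, '-', '_', '.', '~'.
std::string url_encode(const std::string& in) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            out += static_cast<char>(c);   // safe char: copy as-is
        } else {
            out += '%';
            out += hex[c >> 4];            // high four bits
            out += hex[c & 0x0F];          // low four bits
        }
    }
    return out;
}
```

With this, a value such as `a&b` would be sent as `a%26b`, so the server does not mistake the “&” for a pair separator.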

33 of 43

Taking a step back, ways to set message boundaries

  1. Delimiters (e.g., HTTP header key/value pairs)
    1. Have to make sure that your delimiter isn’t used in the data

  2. Specify the size (e.g., HTTP message body)

What are the tradeoffs between these?
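
To make the comparison concrete, here is a sketch of both approaches on a byte stream. All three function names are hypothetical, and the `<length>:<payload>` format is an invented toy framing for illustration, not something HTTP uses.

```cpp
#include <string>
#include <vector>

// Delimiter framing: each message ends with `delim`.
// Trailing bytes without a delimiter are treated as an incomplete message.
std::vector<std::string> split_delimited(const std::string& stream, char delim) {
    std::vector<std::string> messages;
    std::size_t pos = 0;
    std::size_t end;
    while ((end = stream.find(delim, pos)) != std::string::npos) {
        messages.push_back(stream.substr(pos, end - pos));
        pos = end + 1;
    }
    return messages;
}

// Size framing: prefix each payload with its decimal length and a colon.
std::string frame_with_length(const std::string& payload) {
    return std::to_string(payload.size()) + ":" + payload;
}

// Reads back complete "<length>:<payload>" messages; stops at a partial one.
std::vector<std::string> split_length_prefixed(const std::string& stream) {
    std::vector<std::string> messages;
    std::size_t pos = 0;
    while (pos < stream.size()) {
        std::size_t colon = stream.find(':', pos);
        if (colon == std::string::npos) break;          // length not complete
        std::size_t len = std::stoul(stream.substr(pos, colon - pos));
        if (colon + 1 + len > stream.size()) break;     // payload not complete
        messages.push_back(stream.substr(colon + 1, len));
        pos = colon + 1 + len;
    }
    return messages;
}
```

A payload that contains a newline is split into two messages by the delimiter version but comes through intact with the size version. The cost of the size version is that the sender must know the length before it starts sending.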

34 of 43

TCP / IP: ordered, reliable byte streams

HTTP: request and response messages

URLs: Resources and global names for these resources

REST: functions operating on URLs

JSON: encoding and decoding for data structures

35 of 43

REST: methods that you can execute on resources

Hierarchical set of resources defined by URLs

  • Collections: a set of resources
  • Objects: a single resource

Main methods: Create, retrieve, update, and delete

Key: These are just guiding principles; servers can do whatever they want with each HTTP method

36 of 43

Retrieve: HTTP GET

Does not modify anything on the server; it only downloads data

Examples:

  • HTTP GET http://www.appdomain.com/users
  • HTTP GET http://www.appdomain.com/users?size=20&page=5
  • HTTP GET http://www.appdomain.com/users/123
  • HTTP GET http://www.appdomain.com/users/123/address

37 of 43

Create: HTTP POST

Creates a new resource; is not idempotent (repeating the request creates another resource)

(Almost) always includes a “message body” to define the new resource (more on this later!)

Examples:

  • HTTP POST http://www.appdomain.com/users
  • HTTP POST http://www.appdomain.com/users/123/accounts

38 of 43

Create / Update: HTTP PUT

Create or update a single resource

Also usually includes a message body

Difference between POST and PUT?

  • POST is to a collection and PUT is for a specific resource

Examples:

  • HTTP PUT http://www.appdomain.com/users/123
  • HTTP PUT http://www.appdomain.com/users/123/accounts/456

39 of 43

Delete: HTTP DELETE

Used to delete a collection or a resource

Examples:

  • HTTP DELETE http://www.appdomain.com/users/123
  • HTTP DELETE http://www.appdomain.com/users/123/accounts/456
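
The four methods can be pictured as operations on an in-memory collection. The class below is a hypothetical sketch, not a real server: `UserCollection` stands in for `/users`, integer ids stand in for `/users/<id>`, and the bodies are plain strings.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the /users collection.
class UserCollection {
public:
    // POST /users: the server picks the id, so repeating it adds another user.
    int post(const std::string& body) {
        int id = next_id_++;
        users_[id] = body;
        return id;
    }
    // PUT /users/<id>: the client names the resource, so repeating it
    // leaves the collection in the same state (idempotent).
    void put(int id, const std::string& body) {
        users_[id] = body;
        if (id >= next_id_) next_id_ = id + 1;  // keep POST ids unique
    }
    // GET /users/<id>: read-only; returns false if the user does not exist.
    bool get(int id, std::string& body) const {
        std::map<int, std::string>::const_iterator it = users_.find(id);
        if (it == users_.end()) return false;
        body = it->second;
        return true;
    }
    // DELETE /users/<id>: returns false if there was nothing to delete.
    bool remove(int id) { return users_.erase(id) > 0; }

    std::size_t size() const { return users_.size(); }

private:
    std::map<int, std::string> users_;
    int next_id_ = 1;
};
```

Calling `post` twice with the same body leaves two users, while calling `put(123, ...)` twice leaves one, which is the POST versus PUT difference in miniature.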

40 of 43

TCP / IP: ordered, reliable byte streams

HTTP: request and response messages

URLs: Resources and global names for these resources

REST: functions operating on URLs

JSON: encoding and decoding for data structures

41 of 43

JSON: how to send objects “over the wire”

Serialize: Take a C++ object and convert it to a string

Deserialize: Take a string and convert it to a C++ object

Use with HTTP methods for REST call data
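
As a rough sketch of the serialize direction, the code below turns a small C++ struct into a JSON string by hand. The `Customer` struct, `json_escape`, and `to_json` are hypothetical names, with fields modeled on the customer object used in these slides. The escaping covers only quotes, backslashes, and newlines, so real code would more likely rely on a JSON library.

```cpp
#include <string>

// Hypothetical object to send over the wire.
struct Customer {
    int customer_id;
    std::string name;
};

// Escapes the few characters handled by this sketch; other control
// characters would also need escaping in a complete implementation.
std::string json_escape(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            default:   out += c;
        }
    }
    return out;
}

// Serialize: C++ object in, JSON text out.
std::string to_json(const Customer& c) {
    return "{\"Customer_id\": " + std::to_string(c.customer_id) +
           ", \"Name\": \"" + json_escape(c.name) + "\"}";
}
```

The resulting string is what would go in the message body of a POST or PUT request.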

42 of 43

Base elements: objects and arrays

{

"Customer_id": 1234,

"Name": "Honey"

}

[

1234, 5678

]

43 of 43

Values

Object

Array

String

Number

Boolean: true | false

Null

One omission: dates. People tend to use UTC ISO 8601 datetime strings
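
A minimal sketch of producing such a string from a `std::time_t`, using a hypothetical `to_iso8601_utc` helper. Note that `std::gmtime` returns a pointer to shared static storage, so this version is not thread-safe.

```cpp
#include <ctime>
#include <string>

// Formats a timestamp as an ISO 8601 UTC string, e.g. 2024-01-31T08:00:00Z.
std::string to_iso8601_utc(std::time_t t) {
    std::tm tm_utc = *std::gmtime(&t);   // break the time down in UTC
    char buf[32];
    std::strftime(buf, sizeof buf, "%Y-%m-%dT%H:%M:%SZ", &tm_utc);
    return std::string(buf);
}
```

The result would be stored as an ordinary JSON string value, and the receiver is responsible for parsing it back into a date.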