ECS 150: HTTP, URLs, REST, and JSON
Sam King
Administrative
Project 4 out on Wednesday and due on June 2nd
Last time: Networking APIs
This time: HTTP and URLs
Next time: REST and JSON
Goal for today
Teach you the most common protocol
Help you understand the principles behind it in case you ever need to design or implement a different protocol
Problem: We didn’t know when to stop reading!
Why doesn’t this work?
We read in bytes and hoped for the best
Hypertext transfer protocol (HTTP)
Application-layer protocol for internet distributed systems
HTTP Goals:
You have a client, or user agent
The client sends requests to a server
HTTP request
The server responds
HTTP response
Note: this is a bit different than TCP
TCP has the notion of a connection
You can send multiple request/response messages in a single TCP connection, and you can even pipeline messages
You don’t even need to use TCP
HTTP Messages
Request-Line | Status-Line
*(message-header CRLF)
CRLF
[ message-body ]
From last time, HTTP request
GET /hello_world.html HTTP/1.1<CRLF>
Host: localhost:8080<CRLF>
User-Agent: GunrockClient/1.0<CRLF>
Accept: */*<CRLF>
<CRLF>
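The request above is just text with CRLF line endings, so a client can assemble it with string concatenation. A minimal sketch, assuming a request with no message body; the helper name buildRequest is hypothetical and not part of any library or of the project code:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: serialize an HTTP/1.1 request that has no message body.
std::string buildRequest(
    const std::string& method,
    const std::string& target,
    const std::vector<std::pair<std::string, std::string>>& headers) {
    // Request line: method, URI, and HTTP version, terminated by CRLF.
    std::string out = method + " " + target + " HTTP/1.1\r\n";
    // Each header is "Name: value" followed by CRLF.
    for (const auto& header : headers) {
        out += header.first + ": " + header.second + "\r\n";
    }
    // A blank line marks the end of the headers.
    out += "\r\n";
    return out;
}
```

Calling buildRequest("GET", "/hello_world.html", ...) with the Host, User-Agent, and Accept headers shown above should reproduce the same bytes that appear on the slide.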
Request line includes a method, URI, and HTTP version
GET /hello_world.html HTTP/1.1
Message headers are key/value pairs
GET /hello_world.html HTTP/1.1
Host: localhost:8080
User-Agent: GunrockClient/1.0
Accept: */*
The client tells the server that it’s done with the headers by sending a blank line
GET /hello_world.html HTTP/1.1
Host: localhost:8080
User-Agent: GunrockClient/1.0
Accept: */*
<CRLF>
This particular request didn’t include the optional message body, but this is where it would go
GET /hello_world.html HTTP/1.1
Host: localhost:8080
User-Agent: GunrockClient/1.0
Accept: */*
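The blank line is what tells a server when to stop reading headers, which addresses the "we didn't know when to stop reading" problem. A minimal sketch of that idea, assuming the bytes received so far have been accumulated in a std::string; the ParsedRequest struct and parseRequestHead function are hypothetical names used only for illustration:

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical container for the parts of a request head.
struct ParsedRequest {
    std::string method;
    std::string target;
    std::string version;
    std::map<std::string, std::string> headers;
};

// Returns false if the blank line (CRLF CRLF) has not arrived yet, meaning
// the caller should keep reading from the socket. Returns true after
// filling `out` from the request line and the headers.
bool parseRequestHead(const std::string& buf, ParsedRequest& out) {
    const std::string::size_type headEnd = buf.find("\r\n\r\n");
    if (headEnd == std::string::npos) {
        return false;
    }
    std::string::size_type pos = 0;
    bool isRequestLine = true;
    while (pos < headEnd) {
        // A CRLF always exists at or before headEnd, so eol is valid here.
        const std::string::size_type eol = buf.find("\r\n", pos);
        const std::string line = buf.substr(pos, eol - pos);
        pos = eol + 2;
        if (isRequestLine) {
            // Request line: method, URI, and version separated by spaces.
            std::istringstream fields(line);
            fields >> out.method >> out.target >> out.version;
            isRequestLine = false;
            continue;
        }
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) {
            continue;  // Simplification: silently skip malformed headers.
        }
        std::string value = line.substr(colon + 1);
        value.erase(0, value.find_first_not_of(" \t"));  // Trim leading space.
        out.headers[line.substr(0, colon)] = value;
    }
    return true;
}
```

This sketch keeps header names exactly as received; HTTP header names are case-insensitive, so a real server would normalize them before lookup.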
From last time, HTTP response
HTTP/1.1 200 OK<CRLF>
Content-Length: 141<CRLF>
Content-Type: text/html<CRLF>
Server: Gunrock Web<CRLF>
<CRLF>
<!DOCTYPE html>
<html>
<head>
<title>Hello World</title>
</head>
<body>
<p>Hello ECS 150</p>
</body>
</html>
Status line includes the HTTP version, a status code, and a reason phrase
HTTP/1.1 200 OK
HTTP response also includes headers and lets the client know the size of the message body
HTTP/1.1 200 OK
Content-Length: 141
Content-Type: text/html
Server: Gunrock Web
The body includes the content of the file that we requested
<!DOCTYPE html>
<html>
<head>
<title>Hello World</title>
</head>
<body>
<p>Hello ECS 150</p>
</body>
</html>
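On the response side, the Content-Length header tells the client how many body bytes follow the blank line, so the client knows when the message is complete. A minimal sketch, assuming the bytes received so far are held in a std::string and that Content-Length is spelled exactly as shown; extractBody is a hypothetical helper name:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper. Returns true and fills `body` once `buf` holds the
// full header section plus Content-Length bytes of body. Returns false if
// more data still needs to be read.
// Simplifications: the header name is matched case-sensitively, a missing
// Content-Length is treated as an empty body, and a non-numeric value makes
// std::stoul throw.
bool extractBody(const std::string& buf, std::string& body) {
    const std::string::size_type headEnd = buf.find("\r\n\r\n");
    if (headEnd == std::string::npos) {
        return false;  // Headers are not complete yet.
    }
    const std::string key = "\r\nContent-Length:";
    std::size_t length = 0;
    const std::string::size_type k = buf.find(key);
    if (k != std::string::npos && k < headEnd) {
        // std::stoul skips leading whitespace and stops at the CR.
        length = std::stoul(buf.substr(k + key.size()));
    }
    const std::string::size_type bodyStart = headEnd + 4;
    if (buf.size() - bodyStart < length) {
        return false;  // Body is only partially received.
    }
    body = buf.substr(bodyStart, length);
    return true;
}
```

A caller would append each chunk it reads from the socket to the buffer and call extractBody again until it returns true.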
Taking a step back, what are some other protocols that we could have used?
If there are so many ways we could have done it, why did HTTP become ubiquitous?
Other examples
Thrift and Protobuf: binary formats for server-to-server RPC
Standard internet app architecture
The Internet
Frontend / load balancer / TLS termination
HTTP + JSON body
Business logic
...
Core services
Thrift / TCP
Thrift / TCP
Lecture ended here
Administrative
Project 4 out by Monday, due June 7th @ 8am
Last time: HTTP
This time: URLs, REST, and JSON
Next time: JSON and APIs
Uniform resource locators (URLs), full URL
https://bob.cs.ucdavis.edu:81/a/b/page?first=sam&last=king
Uniform resource locators (URLs), host name
https://bob.cs.ucdavis.edu:81/a/b/page?first=sam&last=king
Name of the machine running the HTTP server
Uniform resource locators (URLs), port
https://bob.cs.ucdavis.edu:81/a/b/page?first=sam&last=king
The port that the HTTP server is listening on
Uniform resource locators (URLs), path
https://bob.cs.ucdavis.edu:81/a/b/page?first=sam&last=king
Uniform resource locators (URLs), query
https://bob.cs.ucdavis.edu:81/a/b/page?first=sam&last=king
Key/value pairs passed to the server handler for this particular resource
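The pieces of the example URL can be pulled apart with plain string searches. A minimal sketch, assuming a URL of the form scheme://host:port/path?query; it ignores fragments, user info, and IPv6 literals, and the names UrlParts, splitUrl, and parseQuery are hypothetical:

```cpp
#include <map>
#include <string>

// Hypothetical container for the URL components named on the slides.
struct UrlParts {
    std::string scheme;
    std::string host;
    std::string port;   // Empty if the URL does not name a port.
    std::string path;
    std::string query;  // Everything after '?', still encoded.
};

UrlParts splitUrl(const std::string& url) {
    UrlParts parts;
    std::string rest = url;
    const std::string::size_type schemeEnd = rest.find("://");
    if (schemeEnd != std::string::npos) {
        parts.scheme = rest.substr(0, schemeEnd);
        rest = rest.substr(schemeEnd + 3);
    }
    const std::string::size_type question = rest.find('?');
    if (question != std::string::npos) {
        parts.query = rest.substr(question + 1);
        rest = rest.substr(0, question);
    }
    const std::string::size_type slash = rest.find('/');
    const std::string authority = rest.substr(0, slash);
    parts.path = (slash == std::string::npos) ? "/" : rest.substr(slash);
    const std::string::size_type colon = authority.find(':');
    parts.host = authority.substr(0, colon);
    if (colon != std::string::npos) {
        parts.port = authority.substr(colon + 1);
    }
    return parts;
}

// Split "k1=v1&k2=v2" into key/value pairs (values are left encoded).
std::map<std::string, std::string> parseQuery(const std::string& query) {
    std::map<std::string, std::string> out;
    std::string::size_type pos = 0;
    while (pos < query.size()) {
        std::string::size_type amp = query.find('&', pos);
        if (amp == std::string::npos) {
            amp = query.size();
        }
        const std::string pair = query.substr(pos, amp - pos);
        const std::string::size_type eq = pair.find('=');
        if (!pair.empty()) {
            out[pair.substr(0, eq)] =
                (eq == std::string::npos) ? "" : pair.substr(eq + 1);
        }
        pos = amp + 1;
    }
    return out;
}
```

For the URL on the slides, splitUrl yields the host bob.cs.ucdavis.edu, the port 81, the path /a/b/page, and the query first=sam&last=king, which parseQuery turns into the pairs first=sam and last=king.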
URL encoding
What if we want to pass the server a “&”?
Unreserved characters (A-Z, a-z, 0-9, and - _ . ~) may appear as-is; every other character must be written as %xx, where xx is the character's hex code (for example, & becomes %26)
Taking a step back, ways to set message boundaries
What are the tradeoffs between these?
TCP / IP: ordered, reliable byte streams
HTTP: request and response messages
URLs: Resources and global names for these resources
REST: functions operating on URLs
JSON: encoding and decoding for data structures
REST: methods that you can execute on resources
Hierarchical set of resources defined by URLs
Main methods: Create, retrieve, update, and delete
Key: These are just guiding principles; servers can do whatever they want with each HTTP method
Retrieve: HTTP GET
Does not modify anything on the server; used to download resources
Examples:
Create: HTTP POST
Creates a new resource; not idempotent (repeating the same request creates another resource)
(Almost) always includes a “message body” to define the new resource (more on this later!)
Examples:
Create / Update: HTTP PUT
Create or update a single resource
Also usually includes a message body
Difference between POST and PUT?
Examples:
Delete: HTTP DELETE
Used to delete a collection or a resource
Examples:
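The four methods can be sketched against an in-memory map from URL path to message body. This is an illustrative toy, not how any particular server works; the Store struct, the handle function, and paths such as /customers are assumptions made for the example, and the status codes are the conventional ones (200, 201, 204, 404, 405):

```cpp
#include <map>
#include <string>

// Hypothetical in-memory resource store: URL path -> message body.
struct Store {
    std::map<std::string, std::string> resources;
    int nextId = 1;  // Used to name resources created by POST.
};

// Returns an HTTP status code. For GET, `out` receives the resource body;
// for POST, `out` receives the path of the newly created resource.
int handle(Store& store, const std::string& method, const std::string& path,
           const std::string& body, std::string& out) {
    if (method == "GET") {
        // Retrieve: does not modify the store.
        const auto it = store.resources.find(path);
        if (it == store.resources.end()) {
            return 404;
        }
        out = it->second;
        return 200;
    }
    if (method == "POST") {
        // Create: the server picks a new name under the collection, so
        // repeating the request creates another resource (not idempotent).
        const std::string created = path + "/" + std::to_string(store.nextId++);
        store.resources[created] = body;
        out = created;
        return 201;
    }
    if (method == "PUT") {
        // Create or update: the client names the single resource, so
        // repeating the request leaves the store in the same state.
        const bool existed = store.resources.count(path) > 0;
        store.resources[path] = body;
        return existed ? 200 : 201;
    }
    if (method == "DELETE") {
        return store.resources.erase(path) > 0 ? 204 : 404;
    }
    return 405;  // Method not supported by this sketch.
}
```

In this sketch, sending POST to /customers twice yields /customers/1 and /customers/2, while sending the same PUT to /customers/1 twice leaves a single resource, which is one way to see the idempotence difference between the two methods.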
JSON: how to send objects “over the wire”
Serialize: Take a C++ object and convert it to a string
Deserialize: Take a string and convert it to a C++ object
Use with HTTP methods for REST call data
Base elements: objects and arrays
{
"Customer_id": 1234,
"Name": "Honey"
}
[
1234, 5678
]
Values
Object
Array
String
Number
Boolean: true | false
Null
One omission: dates. People tend to use ISO 8601 UTC datetime strings instead
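Serialization can be sketched by hand for the customer object shown earlier. This is a minimal illustration only; the Customer struct and the escapeJson and toJson helpers are hypothetical, the escaping covers just quotes, backslashes, and a few common control characters, and real code would more likely rely on a JSON library:

```cpp
#include <string>

// Hypothetical C++ object matching the JSON object on the slides.
struct Customer {
    int customerId;
    std::string name;
};

// Escape the characters that would otherwise break a JSON string.
// Simplification: other control characters are passed through unchanged.
std::string escapeJson(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;      break;
        }
    }
    return out;
}

// Serialize: take a C++ object and convert it to a JSON string.
std::string toJson(const Customer& c) {
    return "{\"Customer_id\": " + std::to_string(c.customerId) +
           ", \"Name\": \"" + escapeJson(c.name) + "\"}";
}
```

The resulting string is what would go in the message body of a POST or PUT request, typically with a Content-Type of application/json.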