1 of 39

Section 1: Intro to Labs 0 + 1

CSE 452 Spring 2024

2 of 39

Announcements

  • Student repos should be available at dslabs-[your CSE NETID], reach out to us if yours is missing
  • Lab 1, Lab 1 design doc, Pset 1 due next Wednesday (4/3) [done individually]
  • Review syllabus for late day policy
    • 48 hr grace period for lab 1
    • 48 hr grace period for Pset 1
    • NO grace period for design doc
  • Please fill out the CSE 452 partner / W credit form by Sunday 5pm (3/31)

3 of 39

Labs (in terms of difficulty from what students have told us in the past)

[Chart: relative difficulty of Lab 1, Lab 2, Lab 3, and Lab 4 parts 1-3, as reported by past students]

4 of 39

Time for labs [in hours]

(fairly rough) statistics from Spring 2021

Lab    | Mean Total (Union) | Std. dev. Total (Union) | Median Total (Union)
Lab 1  | 7.438              | 3.336                   | 7.0
Lab 2  | 32.94 (28.09)      | 15.98 (14.17)           | 30 (25)
Lab 3  | 53.57 (37.96)      | 30.48 (24.76)           | 49 (32)
Lab 4* | 37.24 (30.62)      | 21.78 (18.54)           | 30 (27)

Please report times as a single number (float or int) after the colon. Stats are self-reported; some students flipped unions and totals, or reported unions greater than their total time.

*= some people only did up to part 2

5 of 39

Time with design docs [in hours]

(fairly rough) statistics from Winter 2023

Lab                  | Mean Total (Union) | Std. dev. Total (Union) | Median Total (Union)
Lab 2 (Spring 2021)  | 32.94 (28.09)      | 15.98 (14.17)           | 30 (25)
Lab 2 (Winter 2023)  | 26.54              | 10.29                   | 25
  • Design doc*      | 10.64              | 6.09                    | 10
  • Coding           | 15.90              | 9.00                    | 15

*Design docs were due a week before the lab submission.

Doing the design doc in advance of lab 2 saved on average ~6.5 hours of work! Qualitative feedback on design doc usefulness was also very positive.

This quarter, we streamlined the design docs to save student time and still provide the same benefit.

6 of 39

Lab Framework Introduction

7 of 39

Framework

  • Collection of abstract classes and interfaces that you will extend/implement
  • Read the documentation!
    • In /doc/dslabs/framework directory
  • Read the spec!

8 of 39

Node

  • Abstract class that you will subclass
  • Basic unit of computation (aka one machine)
  • Notable methods:
    • set(Timer timer, int timerLengthMillis)
    • send(Message message, Address to)
    • message and timer handlers (naming is important!)
      • Automatically triggers when an XXX message is received/XXX timer goes off
        • handleXXX where XXX is name of message (e.g. handlePongReply)
        • onXXX where XXX is name of timer (e.g. onPingTimer)
        • This is called reflection
  • Handlers should be
    • Deterministic
    • Idempotent whenever possible

9 of 39

Idempotency and Determinism

Determinism: Outcome is only a function of state and message/timer handlers

  • This means no timestamps, no random numbers, no UUIDs

Idempotent: the actions a message triggers (e.g. changing state or setting timers) are applied once for the first copy of a unique message and are not applied again for duplicate copies

  • Problems come up usually when people create new timers for a duplicate message, which expands the state space and makes exploration towards a goal harder
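A minimal sketch of an idempotent update, assuming each message carries a unique sequence number (IdempotentCounter and applyOnce are illustrative names, not framework code):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: an update made idempotent by remembering which sequence
// numbers have already been applied.
public class IdempotentCounter {
    private int value = 0;
    private final Map<Integer, Integer> seen = new HashMap<>();

    // Applies the increment at most once per sequence number;
    // duplicates return the cached result without changing state.
    public int applyOnce(int seqNum, int delta) {
        if (seen.containsKey(seqNum)) {
            return seen.get(seqNum);   // duplicate: no state change
        }
        value += delta;
        seen.put(seqNum, value);
        return value;
    }

    public int value() { return value; }
}
```

Delivering the same message twice leaves the state unchanged, which keeps the explored state space small.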

10 of 39

Types of Nodes:

Client

  • interface that client Nodes should implement
    • sendCommand(), hasResult(), getResult()
  • Called by testing framework

11 of 39

Types of Nodes:

Server

  • No interface for server nodes
  • Will generally call execute() on an Application

12 of 39

Application

  • Where the client wants its RPC (command) to get executed
    • Could be anything! Key-Value store, shopping cart, game, bank account information…
  • Data structures to store data
  • Logic for doing application-specific tasks
  • Processes Commands and produces Results
    • Each application will define its own set of Commands and Results
      • KVStore example: GETS/PUTS/APPENDS
  • Interface
    • Result execute(Command c)
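A stripped-down sketch of the Application idea, with simplified stand-ins for the KVStore commands (the real interface and the GETS/PUTS/APPENDS classes live in the dslabs framework; Get/Put/Append and ToyKVStore here are made up):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for framework Command types.
interface Command {}
record Get(String key) implements Command {}
record Put(String key, String value) implements Command {}
record Append(String key, String value) implements Command {}

public class ToyKVStore {
    private final Map<String, String> store = new HashMap<>();

    // Each command type maps to one application-specific action.
    public String execute(Command c) {
        if (c instanceof Put p) {
            store.put(p.key(), p.value());
            return p.value();
        } else if (c instanceof Append a) {
            String v = store.getOrDefault(a.key(), "") + a.value();
            store.put(a.key(), v);
            return v;
        } else if (c instanceof Get g) {
            return store.get(g.key());
        }
        throw new IllegalArgumentException("unknown command");
    }
}
```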

13 of 39

Message

  • send(Message message, Address to)
  • Encapsulates data passed between Nodes
  • Messages can contain anything (like metadata, Commands, or Results)
  • The Message interface has no methods of its own, but extends Serializable
  • Messages are serialized/deserialized for you
    • Objects are serialized (copied into a packet) when sent and deserialized (copied out of a packet) when received
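To see concretely what serialize/deserialize means, here is a minimal sketch using plain Java serialization (the Ping record is a made-up stand-in for a framework Message; dslabs may serialize differently under the hood):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializeDemo {
    // A message is just serializable data; this record stands in for one.
    record Ping(String payload) implements Serializable {}

    public static void main(String[] args) throws Exception {
        Ping sent = new Ping("hello");

        // Serialize: copy the object into a byte "packet".
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(sent);
        oos.flush();

        // Deserialize: copy it back out. The receiver gets an equal
        // copy, not the same object.
        Ping received = (Ping) new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();

        System.out.println(received.equals(sent)); // prints true
        System.out.println(received == sent);      // prints false
    }
}
```

This is also why mutating a message after sending it is pointless: the receiver only ever sees the copy.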

14 of 39

Timer

  • set(Timer timer, int timerLengthMillis)
  • Triggers the timer handler when the timer fires.
    • If you have a timer called PingTimer, onPingTimer will be called after the timer has been set and the set duration has passed.
  • Will not be automatically reset! Need to set these manually in node.
  • Can contain anything, similar to Messages.

15 of 39

Synchronize

  • You will see many methods that are synchronized, ex:
    • public synchronized Result getResult();
  • Reason: Only one thread can execute a synchronized method on an object at a time (all other threads executing synchronized methods on that object will block until the first thread releases the lock)
    • We use method synchronization, so the entire object gets locked when a synchronized method gets called, but wait() releases the lock.
  • TL;DR: Use synchronized on all Client methods

16 of 39

Wait & Notify

  • Think Condition Variable

while (!checkCondition()) wait();

  • Re-evaluate condition when notified - notify as a “hint”
  • Ex: PingClient.getResult() waits while (pong == null), where pong is set by handlePongReply!
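That pattern can be sketched in plain Java (ToyClient, deliver, and the pong field are illustrative names, not the actual PingClient code):

```java
// Sketch of the wait/notify pattern described above: getResult()
// blocks until a handler thread delivers the value.
public class ToyClient {
    private String pong = null;

    public synchronized String getResult() throws InterruptedException {
        while (pong == null) {   // re-check the condition after every wakeup
            wait();              // releases the object's lock while blocked
        }
        return pong;
    }

    // Would be called by a message handler, e.g. handlePongReply.
    public synchronized void deliver(String result) {
        pong = result;
        notify();                // wake up a waiting getResult()
    }

    public static void main(String[] args) throws InterruptedException {
        ToyClient c = new ToyClient();
        new Thread(() -> c.deliver("pong!")).start();
        System.out.println(c.getResult()); // blocks until deliver() runs
    }
}
```

Note the while loop rather than an if: a notify is only a hint, so the condition must be re-checked after every wakeup.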

17 of 39

Lombok

  • @EqualsAndHashCode
    • Generates equals and hashCode methods for you
  • @Data
    • “A shortcut for @ToString, @EqualsAndHashCode, @Getter on all fields, @Setter on all non-final fields, and @RequiredArgsConstructor!” - https://projectlombok.org/features/Data
  • All messages and timers should have @Data; without it, the state space explodes and you may fail some tests
    • Very common bug in the labs

If you use Intellij, install the Lombok Plugin: https://projectlombok.org/setup/intellij

It’s really just for Boilerplate
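To see why the annotation matters, here is roughly what @Data's equals/hashCode buys you, written by hand (PingMessage is an illustrative class, not framework code, and Lombok's generated code may differ in detail):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Value-based equals and hashCode, like @Data generates. Without
// these, two messages with identical fields look like different
// states to the model checker, blowing up the search.
public class PingMessage {
    private final String payload;

    public PingMessage(String payload) { this.payload = payload; }

    @Override
    public boolean equals(Object o) {
        return o instanceof PingMessage p && Objects.equals(payload, p.payload);
    }

    @Override
    public int hashCode() { return Objects.hash(payload); }

    public static void main(String[] args) {
        Set<PingMessage> states = new HashSet<>();
        states.add(new PingMessage("hi"));
        states.add(new PingMessage("hi"));  // equal message: collapses
        System.out.println(states.size());  // prints 1
    }
}
```

With the default identity-based equals, the set above would contain two "different" states instead of one.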

18 of 39

Lab 0 Tour + Testing

19 of 39

Lab 0 Demo!

20 of 39

run-tests.py

  • ./run-tests.py --lab 0 --debug 1 1 "Hello World,Goodbye World"
  • Options
    • --debug <# servers> <# clients> <comma-separated list of client arguments>: start the visual debugger with the given arguments
    • --test-num, --lab
    • --no-run, --no-search: execute only the runtime tests or the graph search tests
    • -g FINEST: logs every message
  • Use Python 3, Java 17 or higher :)

21 of 39

Type of Tests

Regular (Run)

  • Runs the Nodes, usually in parallel
    • You can use `./run-tests.py --single-threaded` to help debug
    • If it doesn’t work in single threaded, it probably won’t work when running in parallel
  • “UNRELIABLE” tests means that messages could get dropped

Search

  • Checks correctness and liveness
  • State search to look for state violations and/or a goal state (BFS, DFS)
  • Do not rely on logging. The log messages won’t make sense because of how states are explored: the search backtracks, so the logs appear to go back on each other.

22 of 39

Common run commands

Running a single test

  • ./run-tests.py -l LAB_NUM -n TEST_NUM

Running multiple tests

  • ./run-tests.py -l LAB_NUM -n TEST_NUM_1,TEST_NUM_2,TEST_NUM_3

23 of 39

Common run commands

Running with Logging [prints out message receives/timer handles]

  • ./run-tests.py -l LAB_NUM -n TEST_NUM -g FINER

Running with Logging [prints out message sends & receives/timer sets and handles]

  • ./run-tests.py -l LAB_NUM -n TEST_NUM -g FINEST

Running with Logging and writing to a file (Don’t do this for search tests)

  • ./run-tests.py -l LAB_NUM -n TEST_NUM -g FINEST &> output.txt

24 of 39

Common run commands -- [SEARCH] tests

Open up the visualizer on a non-search test with a custom workload

  • ./run-tests.py -l LAB_NUM --debug NUM_SERVERS NUM_CLIENTS WORKLOAD
    • WORKLOAD will look like "PUT:foo:bar,APPEND:foo:baz,GET:foo"

Open up the visualizer on a failed search test [ONLY STARTS VISUALIZER FOR FAILED SEARCH] (You’ll want to click “Debug system with trace”)

  • ./run-tests.py -l LAB_NUM -n SEARCH_TEST_NUM --start-viz

Equals and Hashcode/Idempotency checks (really stupid reasons for failing search tests)

  • ./run-tests.py -l LAB_NUM -n SEARCH_TEST_NUM --checks

Note: --checks doesn’t work on run tests.

25 of 39

Saving SEARCH test traces

./run-tests.py -l LAB_NUM -n SEARCH_TEST_NUM --save-trace

  • Saves a copy of the trace for the search test if there’s an invariant violation
  • Helps with debugging some null pointer exceptions

./run-tests.py --replay-traces [TRACE_NAME...]

  • Runs the trace again to see if you get the same error
    • (You might need to manage your traces)
  • Works with --start-viz

26 of 39

General Lab Debugging

Run

Search

Invariant Violations

-g FINER &> log.txt → might need to print out multiple runs if issue does not appear every time

Visual Debugger → retrace steps that led to the invariant violation

Liveness Violation or Timeout

-g FINER &> log.txt → look for patterns in log file

Visual Debugger → try to drive system towards goal (under the constraints of the test)

27 of 39

Lab 1

28 of 39

Recommendations/General Notes

  • Start early!
  • Ask questions in lecture or on Ed
  • Labs get much, much, much harder
  • Know what you’re trying to implement before coding (i.e. do the design docs!)
  • Read the spec and reread the spec
  • You may find yourself spending time to rewrite your code (to make it cleaner or to trim unnecessary implementation details) → that’s totally ok and expected

29 of 39

RPC - Remote Procedure Call

When a computer program causes a procedure to execute in a different address space (perhaps on another computer)

30 of 39

RPC Semantics

  • How many times will an RPC be executed by the server before it returns to the invoker of the RPC?
  • At least once
    • Client retries
  • At most once
    • Server prevents duplicate (non-unique) requests from being executed more than once
    • UID (sequence numbers like 1,2,3,4)
      • [Don’t use UUID.randomUUID]
    • Idempotent
  • Exactly once
    • At most once with retries

31 of 39

Application vs. AMOApplication (Lab 1)

Wrapper around original Application

Makes sure requests executed At Most Once
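A minimal sketch of the at-most-once idea, assuming each client tags requests with increasing sequence numbers and waits for a result before sending the next (ToyAMOWrapper and executeAdd are made-up names; the real AMOApplication wraps the framework's Application interface):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of an at-most-once wrapper: remember the latest
// (seqNum, result) per client and replay the cached result for
// duplicates instead of re-executing.
public class ToyAMOWrapper {
    private int total = 0;  // stands in for the wrapped application's state
    private final Map<String, Integer> lastSeq = new HashMap<>();
    private final Map<String, Integer> lastResult = new HashMap<>();

    public int executeAdd(String client, int seqNum, int amount) {
        Integer prev = lastSeq.get(client);
        if (prev != null && seqNum <= prev) {
            return lastResult.get(client);  // duplicate: replay cached result
        }
        total += amount;                    // execute on the real application
        lastSeq.put(client, seqNum);
        lastResult.put(client, total);
        return total;
    }
}
```

Note the sequence numbers are plain increasing ints per client, matching the earlier advice to avoid UUID.randomUUID (which would break determinism).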

32 of 39

Lab 1 suggestions

  • Do not store StringBuilder for the key or value in the KVStore map
  • Make sure that alreadyExecuted uses the right comparison operators
  • Do not use any static, mutable fields in your classes (constants are fine).
  • In Part 3, be sure to integrate the AMO package into SimpleClient and SimpleServer. In SimpleServer, make sure you are constructing an AMOApplication
  • Do not use CompletableFuture
  • While the labs recommend IntelliJ, it isn’t necessary to use it. Feel free to use your favorite IDE/text editor.
    • There might be some weird saving issues with Intellij where it says it saved some change, but the version saved on the machine doesn’t actually reflect that change, so watch out for that. Try checking with git status/git diff.
  • If you have import errors on your IDE, add paths to the JARs provided

33 of 39

Lab 1 suggestions (continued)

  • Don’t use a Hashtable, use HashMap instead
    • (see framework docs for Node)
  • Read the framework docs under docs
  • Follow the structure for Lab 0
  • If you see effects from previous tests, make sure that you don’t have static variables.

34 of 39

Good Practice for Timers

  • If a Node constructs and sets a fresh Timer object every time a timer fires, the state space grows and the testing framework may halt or appear to make no progress (which is bad).
  • To reduce the number of states, when a timer goes off and you want to reset it, call set with the same Timer object instead of passing a new one (assuming the state of the Timer does not change; if it does, you will need to construct a new one):

private synchronized void onYTimer(YTimer t) {
    set(t, RETRY_MILLIS);
}

  • Recommended: Reset timers at the end of the method/if-statement so that the timer isn’t ticking while still processing the method

35 of 39

Gentle Reminders

  • Student repos should be available at dslabs-[your CSE NETID], reach out to us if yours is missing
  • Lab 1, PSet 1 due next Wednesday (4/3) [done individually]
    • Run make submit.tar.gz and submit the tar file to Gradescope (one per group in the future)
  • Problem sets: see Gradescope for due dates
  • Please fill out the CSE 452 partner form by Sunday (3/31) @ 5pm

36 of 39

Submitting Solutions

Submit design document separately in gradescope (one per group)

For later labs, the design doc is due a week before the lab submission, but it’s OK to update it afterward

Run make submit.tar.gz and submit the tar file to Gradescope (one per group)

37 of 39

To run the visualizer remotely on attu or CSE lab computer

attu and the CSE lab computers have an X Window client installed; you need an X11 server on your own machine to display what the client sends. (contributor: Michael W. 🦊):

Windows: https://www.howtogeek.com/261575/how-to-run-graphical-linux-desktop-applications-from-windows-10s-bash-shell/

MacOS: https://www.macworld.com/article/1134207/leopardx11.html

Linux: https://www.addictivetips.com/ubuntu-linux-tips/set-up-x11-forwarding-on-linux/

Once the X11 server is set up and you are running the search tests with the visualizer on the lab computer, you can ssh into the lab computer and run
$ google-chrome &
to open a browser; you should then be able to connect to localhost:3000 (or a similar port).

PuTTY settings if you choose to use PuTTY

It’s kinda slow though. Usually better to run it on your own computer or in person in the lab

38 of 39

Tour of the Visualization Tool

./run-tests.py --lab 0 --debug 1 1 "hi,bye"

39 of 39

Pulling Updates

Updates will be distributed via git (do it now plz).

Do this once at the start of the quarter:

git remote add upstream git@gitlab.cs.washington.edu:cse452-24sp/dslabs/dslabs-handout.git

Do this to pull:

git fetch upstream
git merge upstream/main