1 of 32

Section 1: Intro to Labs 0 + 1

CSE 452 Winter 2023

2 of 32

Acknowledgements

These dslabs slides were developed as a collaborative effort among the TAs for UW CSE 452/552 over several years. We thank all who contributed.

Adnan Ahmad, Nick Anderson, Anirban (Anir) Biswas, Anirudh Canumalla, Riley Chang, Tianyi Cui, Colin Evans, Yael Goldin, Divye Jain, Kushal Jhunjhunwalla, Justin Johnson, Sarang Joshi, Lukas Joswiak, Arthur Liang, Jiuru Li, Boyan Li, CJ Lin, Wei Lin, Travis McGaha, Ellis Michael, Samantha Miller, David Porter, Raden (Roy) Pradana, Guramrit Singh, Luxi Wang, Andrew Wei, Zachariah Wolfe, Doug Woos, Lanhao Wu, Yuchong (Yvonna) Xiang, Shibin (Jack) Xu, Paul Yau, Dao Yi, Michael Yu, Sarah Yu, Aileen Zeng, Kaiyuan Zhang, Leiyi Zhang, Eddy Zhou

3 of 32

Announcements

  • Student repos should be available at dslabs-[your CSE NETID]; reach out to us if yours is missing
  • Lab 1 due next Friday (1/13) [done individually]
  • Problem sets: see course website for due dates
  • Keep an eye out for partner form next week

4 of 32

Time for labs [in hours]

(fairly rough) statistics from Spring 2021

Lab    | Mean: Total (Union) | Std. dev.: Total (Union) | Median: Total (Union)
Lab 1  | 7.438               | 3.336                    | 7.0
Lab 2  | 32.94 (28.09)       | 15.98 (14.17)            | 30 (25)
Lab 3  | 53.57 (37.96)       | 30.48 (24.76)            | 49 (32)
Lab 4* | 37.24 (30.62)       | 21.78 (18.54)            | 30 (27)

Please report times as a single number (float or int) after the colon. These stats are self-reported, so take them with a grain of salt: some people flipped unions and totals, or reported unions greater than their total time.

*= some people only did up to part 2

5 of 32

Time for labs [in hours]

(very rough) statistics from Winter 2021

Lab    | Mean  | Std. dev. | Median
Lab 1  | 8.90  | 4.35      | 8
Lab 2  | 32.29 | 18.29     | 30
Lab 3  | 66.19 | 40.47     | 60
Lab 4* | 46.59 | 29.76     | 40

Note: Putting a single number at the end of WRITEUP.txt makes it easier to collect accurate statistics; inconsistent reporting makes the numbers unreliable. If possible, just put one combined time at the end (the union of the time spent would be ideal).

Most groups either combined the time they spent working together or listed their times separately even though they worked together, so the lows and highs are roughly /2 or *2 approximations.


Conclusion: This class takes time. Most of it goes into debugging, so write code precisely and carefully. Time spent on the labs varies depending on which bugs you run into.

Also take breaks as necessary to avoid burnout.

6 of 32

Labs (in terms of difficulty from what students have told us in the past)

Roughly increasing: Lab 1 → Lab 2 → Lab 3 → Lab 4 p1 → Lab 4 p2 → Lab 4 p3

7 of 32

Recommendations/General Notes

  • Start early!
  • Ask questions in section/lecture/on ed
  • Labs get much much much harder
  • Know what you’re trying to implement before coding
  • Read the spec and reread the spec
  • New! In the past, we left several implementation details out because many different designs work. You can still figure them out yourself, but we added more hints, diagrams, and implementation suggestions in case you get stuck, to alleviate suffering.
  • You may find yourself spending time to rewrite your code (to make it cleaner or to trim unnecessary implementation details) -> that’s totally ok and expected

8 of 32

Framework

  • Collection of abstract classes and interfaces that you will extend/implement
  • Read the documentation!
    • In /doc/dslabs/framework directory
  • Read the spec!

9 of 32

Node

  • Abstract class that you will subclass
  • Basic unit of computation (aka one machine)
  • Notable methods:
    • set(Timer timer, int timerLengthMillis)
    • send(Message message, Address to)
    • message and timer handlers (naming is important!)
      • Automatically triggered when an XXX message is received / an XXX timer goes off
        • handleXXX where XXX is name of message (e.g. handlePongReply)
        • onXXX where XXX is name of timer (e.g. onPingTimer)
        • This is called reflection
  • Handlers should be
    • Deterministic
    • Idempotent whenever possible

Read the documentation!
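The handleXXX/onXXX naming convention can be illustrated with a small sketch of reflective dispatch. This is plain Java with made-up Ping/Pong classes, not the framework's actual code; it just shows the idea of finding a handler by the message's class name:

```java
import java.lang.reflect.Method;

// Toy messages -- stand-ins for the framework's Message types.
class Ping { final String value; Ping(String v) { value = v; } }
class Pong { final String value; Pong(String v) { value = v; } }

// A node whose handlers follow the handleXXX naming convention.
class EchoNode {
    String lastPong = null;

    void handlePing(Ping m) { /* a real node would send back a Pong */ }
    void handlePong(Pong m) { lastPong = m.value; }

    // Dispatch a message to handle<SimpleClassName>, mimicking (in spirit)
    // how handlers are located by name via reflection.
    void dispatch(Object message) {
        try {
            Method h = getClass().getDeclaredMethod(
                    "handle" + message.getClass().getSimpleName(),
                    message.getClass());
            h.invoke(this, message);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

This is why the naming matters: a typo like handlePongReplyy means the handler is simply never found.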

10 of 32

Client/Server

  • Types of Nodes
  • Client
    • interface that client Nodes should implement
      • sendCommand(), hasResult(), getResult()
    • Called by testing framework
  • Server
    • No interface for server nodes
    • Will generally call execute() on an Application

11 of 32

Application

  • Backing data structures + logic used by nodes (i.e. key/value store, shopping cart, etc.)
  • Processes Commands and produces Results
    • Each application will define its own set of Commands and Results
    • Examples of commands are GETs and PUTs on a server
  • Interface
    • Result execute(Command c)
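A minimal sketch of the execute() pattern, using hypothetical Command/Result types and a toy key/value store (the lab's actual interfaces live in the framework docs):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the framework's Command/Result marker types.
interface Command {}
interface Result {}

class Get implements Command { final String key; Get(String k) { key = k; } }
class Put implements Command {
    final String key, value;
    Put(String k, String v) { key = k; value = v; }
}
class KVResult implements Result {
    final String value;
    KVResult(String v) { value = v; }
}

// A minimal key/value Application: all state lives behind execute().
class KVStore {
    private final Map<String, String> store = new HashMap<>();

    Result execute(Command c) {
        if (c instanceof Put) {
            Put p = (Put) c;
            store.put(p.key, p.value);
            return new KVResult(p.value);
        }
        Get g = (Get) c;
        return new KVResult(store.get(g.key));  // null if key absent
    }
}
```

The key design point: nodes never poke at the store directly; they funnel every operation through execute(), so the same Application can sit behind any replication protocol.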

12 of 32

Message

  • send(Message message, Address to)
  • Encapsulates data passed between Nodes
  • Messages can contain Commands and Results
  • Interface with no methods, but extends Serializable
  • Messages are serialized/deserialized for you
    • Objects are serialized (copied into a packet) when sent and deserialized (copied out of a packet) when received
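The copy-on-send behavior can be seen with a standalone serialization round trip. This is plain Java Serializable machinery, not framework code; PingMessage and Wire are illustrative names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// A message must be Serializable so it can be copied into a packet.
class PingMessage implements Serializable {
    final String payload;
    PingMessage(String p) { payload = p; }
}

class Wire {
    // Serialize then deserialize, as happens on every send/receive.
    static PingMessage roundTrip(PingMessage m) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(m);
            out.flush();
            ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()));
            return (PingMessage) in.readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

One practical consequence: the receiver gets a distinct copy, so mutating a message object after sending it never changes what the other node sees.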

13 of 32

Timer

  • set(Timer timer, int timerLengthMillis)
  • Delivered to Nodes via their timeout handlers
  • Will not be automatically delivered! Need to set these manually in Nodes.
    • If a Node keeps resetting a Timer indefinitely every time it fires, the testing framework may halt or make no progress (which is bad).
    • Additionally, to reduce the number of states when a timer goes off: if you want to reset the timer, call set with the same Timer object, instead of passing a new Timer object
  • Can contain other data and call methods
    • Recommended: Reset timers at the end of the method/if-statement so that the timer isn’t ticking while still processing the method

14 of 32

Synchronize

  • You will see many methods that are synchronized, ex:

public synchronized Result getResult();

  • Only one thread can execute a synchronized method on an object at a time (all other threads executing synchronized methods on that object will block until the first thread releases the lock)
    • We use method synchronization, so the entire object gets locked when a synchronized method gets called, but wait() releases the lock.
  • Consider PingClient with getResult() and handlePongReply() without the synchronized keyword

15 of 32

Wait & Notify

  • Think Condition Variable

while (!checkCondition()) wait();

  • Re-evaluate the condition when notified; treat notify as a “hint”
  • Ex: PingClient.getResult() waits while (pong == null), where pong is set by handlePongReply!
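The PingClient pattern above can be sketched with plain Java monitors. The names here (ResultHolder, handlePongReply) are illustrative stand-ins, not the lab's actual classes:

```java
// getResult() blocks until the handler thread delivers a result.
class ResultHolder {
    private String pong = null;

    synchronized void handlePongReply(String value) {
        pong = value;
        notifyAll();  // wake any thread blocked in getResult()
    }

    synchronized String getResult() {
        while (pong == null) {            // re-check: a notify is only a hint
            try {
                wait();                   // releases the lock while blocked
            } catch (InterruptedException e) {
                // retry the loop; condition is re-checked on wakeup
            }
        }
        return pong;
    }
}
```

Note the while loop rather than an if: spurious wakeups and racing threads mean the condition must be re-checked every time wait() returns.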

16 of 32

Lombok

  • @EqualsAndHashCode
    • Generates equals and hashCode methods for you
  • @Data
    • “A shortcut for @ToString, @EqualsAndHashCode, @Getter on all fields, @Setter on all non-final fields, and @RequiredArgsConstructor!” - https://projectlombok.org/features/Data
  • All messages and timers should have @Data; without it, the state space explodes and you may fail some tests

If you use Intellij, install the Lombok Plugin: https://projectlombok.org/setup/intellij

It’s really just for boilerplate
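To see why @Data matters: the search tests deduplicate states by value equality, so two messages with the same fields must compare equal. Here is roughly the equals/hashCode that @Data would generate for a simple message, written by hand (Request and its fields are illustrative):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Roughly what @Data generates: value equality over all fields, so the
// model checker can recognize two identical messages/states as the same.
class Request {
    final String key;
    final int seqNum;
    Request(String key, int seqNum) { this.key = key; this.seqNum = seqNum; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Request)) return false;
        Request r = (Request) o;
        return seqNum == r.seqNum && Objects.equals(key, r.key);
    }
    @Override public int hashCode() { return Objects.hash(key, seqNum); }
}
```

Without these overrides, Java falls back to reference equality, every copy of a message looks like a brand-new state, and the search space blows up.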

17 of 32

run-tests.py

  • ./run-tests.py --lab 0 --debug 1 1 "Hello World,Goodbye World"
  • Options
    • --debug <# servers> <# clients> <comma-separated list of client arguments>: start the visual debugger with the given arguments
    • --test-num, --lab
    • --no-run, --no-search: execute only the runtime tests or the graph search tests
    • -g FINEST: logs every message
  • Use Python 3, Java 17 :)

18 of 32

Submitting Solutions

Submit design document separately in gradescope (one per group)

For later labs, design doc due week before lab, but ok to update

Do make submit.tar.gz and submit the tar file to Gradescope (one per group)

Keep an eye out for the partner form next week!

19 of 32

To run the visualizer remotely on attu or CSE lab computer

Linux machines like attu have an X Window (X11) client installed; you need an X11 server on your own machine to display what the client sends. (contributor: Michael W. 🦊):

Windows: https://www.howtogeek.com/261575/how-to-run-graphical-linux-desktop-applications-from-windows-10s-bash-shell/

MacOS: https://www.macworld.com/article/1134207/leopardx11.html

Linux: https://www.addictivetips.com/ubuntu-linux-tips/set-up-x11-forwarding-on-linux/

Once the X11 server is set up and you are running the search tests with the visualizer on the lab computer, you can ssh into the lab computer and run

$ google-chrome &

to open a browser; you should be able to connect to localhost:3000 or something similar.

PuTTy settings if you choose to use PuTTy

It’s kinda slow though. Usually better to run it on your own computer or in person in the lab

20 of 32

Type of Tests

Regular (Run)

  • Runs the Nodes, usually in parallel
    • You can use `./run-tests.py --single-threaded` to help debug
    • If it doesn’t work in single threaded, it probably won’t work when running in parallel
  • “UNRELIABLE” tests means that messages could get dropped

Search

  • Checks correctness and liveness
  • State search to look for state violations and/or a goal state (BFS, DFS)
  • Do not use logging. The log messages won’t make sense because of how states are explored, i.e. the search may backtrack, so logs appear to go back on themselves.

21 of 32

Common run commands

Running a single test

  • ./run-tests.py -l LAB_NUM -n TEST_NUM

Running multiple tests

  • ./run-tests.py -l LAB_NUM -n TEST_NUM_1,TEST_NUM_2,TEST_NUM_3

22 of 32

Common run commands

Running with Logging [prints out message receives/timer handles]

  • ./run-tests.py -l LAB_NUM -n TEST_NUM -g FINER

Running with Logging [prints out message sends & receives/timer sets and handles]

  • ./run-tests.py -l LAB_NUM -n TEST_NUM -g FINEST

Running with Logging and writing to a file (Don’t do this for search tests)

  • ./run-tests.py -l LAB_NUM -n TEST_NUM -g FINEST &> output.txt

23 of 32

Common run commands -- [SEARCH] tests

Open up the visualizer on a non-search test with a custom workload

  • ./run-tests.py -l LAB_NUM --debug NUM_SERVERS NUM_CLIENTS WORKLOAD
    • WORKLOAD will look like "PUT:foo:bar,APPEND:foo:baz,GET:foo"

Open up the visualizer on a failed search test [ONLY STARTS VISUALIZER FOR FAILED SEARCH] (You’ll want to click “Debug system with trace”)

  • ./run-tests.py -l LAB_NUM -n SEARCH_TEST_NUM --start-viz

Equals and Hashcode/Idempotency checks (really stupid reasons for failing search tests)

  • ./run-tests.py -l LAB_NUM -n SEARCH_TEST_NUM --checks

Note: --checks doesn’t work on run tests.

24 of 32

Saving SEARCH test traces

./run-tests.py -l LAB_NUM -n SEARCH_TEST_NUM --save-trace

  • Saves a copy of the trace for the search test if there’s an invariant violation
  • Helps with debugging some null pointer exceptions

./run-tests.py --check-trace

  • Runs the trace again to see if you get the same error
    • (You might need to manage your traces)
  • Works with --start-viz

25 of 32

Tour of the Visualization Tool

./run-tests.py --lab 0 --debug 1 1 "hi,bye"

26 of 32

RPC - Remote Procedure Call

When a computer program causes a procedure to execute in a different address space (perhaps on another computer)

27 of 32

RPC Semantics

  • How many times will an RPC be executed by the server before it returns to the invoker of the RPC?
  • At least once
    • Retry
  • At most once
    • UID (sequence numbers like 1,2,3,4)
      • [Don’t use UUID.randomUUID]
    • Idempotent
  • Exactly once
    • At most once with retries

28 of 32

Idempotency and Determinism

Idempotent: the actions taken for a unique message (e.g. changing state or setting timers) are applied once and are not applied again for duplicate messages

  • Problems usually come up when people create new timers for a duplicate message, which expands the state space and makes exploration towards a goal harder

Determinism: Outcome is only a function of state and message/timer handlers

  • This means no timestamps, no random numbers, no UUIDs

29 of 32

Pulling Updates

Updates will be distributed via git (do it now plz).

Do this once at the start of the quarter:

git remote add upstream git@gitlab.cs.washington.edu:cse452-23wi/dslabs/dslabs-handout.git

Do this to pull:

git fetch upstream
git merge upstream/main

30 of 32

Lab 1 help/hints

  • Do not store StringBuilder for the key or value in the KVStore map
  • Make sure that alreadyExecuted uses the right comparison operators
  • “Do not use any static, mutable fields in your classes (constants are fine). Your nodes (i.e., instances of the Node classes) should communicate only through message passing”
    • E.g. the Map in KVStore
  • In Part 3, be sure to integrate the AMO package into SimpleClient and SimpleServer. In SimpleServer, make sure you are constructing an AMOApplication
  • Do not use CompletableFuture
  • While the labs recommend IntelliJ, it isn’t necessary to use it. Feel free to use your favorite IDE/text editor. (vim)
    • There might be some weird saving issues with IntelliJ where it says it saved a change, but the version on disk doesn’t actually reflect that change, so watch out for that. Try checking with git status/git diff.
  • If you have import errors on your IDE, add paths to the JARs provided

31 of 32

Lab 1 more help/hints

  • Don’t use a Hashtable (see framework docs for Node)
  • Read the framework docs under docs
  • You can follow the structure for Lab 0
  • If you see effects from previous tests, make sure that you don’t have static variables.

32 of 32

General Lab Debugging

Invariant Violations

  • Run: -g FINER &> log.txt → might need to print out multiple runs if the issue does not appear every time
  • Search: Visual Debugger → retrace the steps that led to the invariant violation

Liveness Violation or Timeout

  • Run: -g FINER &> log.txt → look for patterns in the log file
  • Search: Visual Debugger → try to drive the system towards the goal (under the constraints of the test)