1 of 32

Section 1: Intro to Labs 0 + 1

CSE 452 Winter 2023

2 of 32

Acknowledgements

These dslabs slides were developed as a collaborative effort among the TAs for UW CSE 452/552 over several years. We thank all who contributed.

Adnan Ahmad, Nick Anderson, Anirban (Anir) Biswas, Anirudh Canumalla, Riley Chang, Tianyi Cui, Colin Evans, Yael Goldin, Divye Jain, Kushal Jhunjhunwalla, Justin Johnson, Sarang Joshi, Lukas Joswiak, Arthur Liang, Jiuru Li, Boyan Li, CJ Lin, Wei Lin, Travis McGaha, Ellis Michael, Samantha Miller, David Porter, Raden (Roy) Pradana, Guramrit Singh, Luxi Wang, Andrew Wei, Zachariah Wolfe, Doug Woos, Lanhao Wu, Yuchong (Yvonna) Xiang, Shibin (Jack) Xu, Paul Yau, Dao Yi, Michael Yu, Sarah Yu, Aileen Zeng, Kaiyuan Zhang, Leiyi Zhang, Eddy Zhou

3 of 32

Announcements

  • Student repos should be available at dslabs-[your CSE NETID]; reach out to us if yours is missing
  • Lab 1 due next Friday (1/13) [done individually]
  • Problem sets: see course website for due dates
  • Keep an eye out for partner form next week

4 of 32

Time for labs [in hours]

(fairly rough) statistics from Spring 2021

Lab    | Mean: Total (Union) | Std. dev.: Total (Union) | Median: Total (Union)
Lab 1  | 7.438               | 3.336                    | 7.0
Lab 2  | 32.94 (28.09)       | 15.98 (14.17)            | 30 (25)
Lab 3  | 53.57 (37.96)       | 30.48 (24.76)            | 49 (32)
Lab 4* | 37.24 (30.62)       | 21.78 (18.54)            | 30 (27)

Please report times as a single number (float or int) after the colon. These stats are self-reported, so take them with a grain of salt: some people flipped unions and totals, or reported unions greater than their total time.

*= some people only did up to part 2

5 of 32

Time for labs [in hours]

(very rough) statistics from Winter 2021

Lab    | Mean  | Std. dev. | Median
Lab 1  | 8.90  | 4.35      | 8
Lab 2  | 32.29 | 18.29     | 30
Lab 3  | 66.19 | 40.47     | 60
Lab 4* | 46.59 | 29.76     | 40

Note: Putting a single number at the end of WRITEUP.txt makes it easier to collect accurate statistics; inconsistent reporting makes the numbers unreliable. If possible, just put one combined time at the end (the union of the time spent would be ideal).

Most groups either combined the time they spent working together or listed their times separately even though they worked together, so the lows and highs are roughly /2 or *2 approximations.


Conclusion: This class takes time. Most of it goes into debugging, so write code precisely and carefully. Time spent on the labs varies depending on which bugs you run into.

Also take breaks as necessary to avoid burnout.

6 of 32

Labs (in terms of difficulty from what students have told us in the past)

Roughly increasing: Lab 1 → Lab 2 → Lab 3 → Lab 4 p1 → Lab 4 p2 → Lab 4 p3

7 of 32

Recommendations/General Notes

  • Start early!
  • Ask questions in section/lecture/on ed
  • Labs get much much much harder
  • Know what you’re trying to implement before coding
  • Read the spec and reread the spec
  • New! In the past, we left several implementation details out because many different designs work. You can still figure them out yourself, but we added more hints, diagrams, and implementation suggestions in case you get stuck, to alleviate suffering.
  • You may find yourself spending time to rewrite your code (to make it cleaner or to trim unnecessary implementation details) -> that’s totally ok and expected

8 of 32

Framework

  • Collection of abstract classes and interfaces that you will extend/implement
  • Read the documentation!
    • In /doc/dslabs/framework directory
  • Read the spec!

9 of 32

Node

  • Abstract class that you will subclass
  • Basic unit of computation (aka one machine)
  • Notable methods:
    • set(Timer timer, int timerLengthMillis)
    • send(Message message, Address to)
    • message and timer handlers (naming is important!)
      • Automatically triggered when an XXX message is received / an XXX timer goes off
        • handleXXX where XXX is name of message (e.g. handlePongReply)
        • onXXX where XXX is name of timer (e.g. onPingTimer)
        • This is called reflection
  • Handlers should be
    • Deterministic
    • Idempotent whenever possible

Read the documentation!
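The handleXXX/onXXX naming convention can be illustrated with a small sketch of reflective dispatch. This is plain Java with made-up Ping/Pong classes, not the framework's actual code; it just shows the idea of finding a handler by the message's class name:

```java
import java.lang.reflect.Method;

// Toy messages -- stand-ins for the framework's Message types.
class Ping { final String value; Ping(String v) { value = v; } }
class Pong { final String value; Pong(String v) { value = v; } }

// A node whose handlers follow the handleXXX naming convention.
class EchoNode {
    String lastPong = null;

    void handlePing(Ping m) { /* a real node would send back a Pong */ }
    void handlePong(Pong m) { lastPong = m.value; }

    // Dispatch a message to handle<SimpleClassName>, mimicking (in spirit)
    // how handlers are located by name via reflection.
    void dispatch(Object message) {
        try {
            Method h = getClass().getDeclaredMethod(
                    "handle" + message.getClass().getSimpleName(),
                    message.getClass());
            h.invoke(this, message);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

This is why the naming matters: a typo like handlePongReplyy means the handler is simply never found.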

10 of 32

Client/Server

  • Types of Nodes
  • Client
    • interface that client Nodes should implement
      • sendCommand(), hasResult(), getResult()
    • Called by testing framework
  • Server
    • No interface for server nodes
    • Will generally call execute() on an Application

11 of 32

Application

  • Backing data structures + logic used by nodes (i.e. key/value store, shopping cart, etc.)
  • Processes Commands and produces Results
    • Each application will define its own set of Commands and Results
    • Examples of commands are GETs and PUTs on a server
  • Interface
    • Result execute(Command c)
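A minimal sketch of the execute() pattern, using hypothetical Command/Result types and a toy key/value store (the lab's actual interfaces live in the framework docs):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the framework's Command/Result marker types.
interface Command {}
interface Result {}

class Get implements Command { final String key; Get(String k) { key = k; } }
class Put implements Command {
    final String key, value;
    Put(String k, String v) { key = k; value = v; }
}
class KVResult implements Result {
    final String value;
    KVResult(String v) { value = v; }
}

// A minimal key/value Application: all state lives behind execute().
class KVStore {
    private final Map<String, String> store = new HashMap<>();

    Result execute(Command c) {
        if (c instanceof Put) {
            Put p = (Put) c;
            store.put(p.key, p.value);
            return new KVResult(p.value);
        }
        Get g = (Get) c;
        return new KVResult(store.get(g.key));  // null if key absent
    }
}
```

The key design point: nodes never poke at the store directly; they funnel every operation through execute(), so the same Application can sit behind any replication protocol.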

12 of 32

Message

  • send(Message message, Address to)
  • Encapsulates data passed between Nodes
  • Messages can contain Commands and Results
  • Interface with no methods, but extends Serializable
  • Messages are serialized/deserialized for you
    • Objects are serialized (copied into a packet) when sent and deserialized (copied out of a packet) when received
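The copy-on-send behavior can be seen with a standalone serialization round trip. This is plain Java Serializable machinery, not framework code; PingMessage and Wire are illustrative names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// A message must be Serializable so it can be copied into a packet.
class PingMessage implements Serializable {
    final String payload;
    PingMessage(String p) { payload = p; }
}

class Wire {
    // Serialize then deserialize, as happens on every send/receive.
    static PingMessage roundTrip(PingMessage m) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(m);
            out.flush();
            ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()));
            return (PingMessage) in.readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

One practical consequence: the receiver gets a distinct copy, so mutating a message object after sending it never changes what the other node sees.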

13 of 32

Timer

  • set(Timer timer, int timerLengthMillis)
  • Delivered to Nodes via their timeout handlers
  • Will not be automatically delivered! Need to set these manually in Nodes.
    • If a Node keeps resetting a Timer indefinitely every time it fires, the testing framework may halt or make no progress (which is bad).
    • Additionally, to reduce the number of states when a timer goes off: if you want to reset the timer, call set with the same Timer object, instead of passing a new Timer object
  • Can contain other data and call methods
    • Recommended: Reset timers at the end of the method/if-statement so that the timer isn’t ticking while still processing the method

14 of 32

Synchronize

  • You will see many methods that are synchronized, ex:

public synchronized Result getResult();

  • Only one thread can execute a synchronized method on an object at a time (all other threads executing synchronized methods on that object will block until the first thread releases the lock)
    • We use method synchronization, so the entire object gets locked when a synchronized method gets called, but wait() releases the lock.
  • Consider PingClient with getResult() and handlePongReply() without the synchronized keyword

15 of 32

Wait & Notify

  • Think Condition Variable

while (!checkCondition()) wait();

  • Re-evaluate the condition when notified; treat notify as a “hint”
  • Ex: PingClient.getResult() waits while (pong == null), where pong is set by handlePongReply!
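The PingClient pattern above can be sketched with plain Java monitors. The names here (ResultHolder, handlePongReply) are illustrative stand-ins, not the lab's actual classes:

```java
// getResult() blocks until the handler thread delivers a result.
class ResultHolder {
    private String pong = null;

    synchronized void handlePongReply(String value) {
        pong = value;
        notifyAll();  // wake any thread blocked in getResult()
    }

    synchronized String getResult() {
        while (pong == null) {            // re-check: a notify is only a hint
            try {
                wait();                   // releases the lock while blocked
            } catch (InterruptedException e) {
                // retry the loop; condition is re-checked on wakeup
            }
        }
        return pong;
    }
}
```

Note the while loop rather than an if: spurious wakeups and racing threads mean the condition must be re-checked every time wait() returns.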

16 of 32

Lombok

  • @EqualsAndHashCode
    • Generates equals and hashCode methods for you
  • @Data
    • “A shortcut for @ToString, @EqualsAndHashCode, @Getter on all fields, @Setter on all non-final fields, and @RequiredArgsConstructor!” - https://projectlombok.org/features/Data
  • All messages and timers should have @Data; without it, the state space explodes and you may fail some tests

If you use Intellij, install the Lombok Plugin: https://projectlombok.org/setup/intellij

It’s really just for boilerplate
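To see why @Data matters: the search tests deduplicate states by value equality, so two messages with the same fields must compare equal. Here is roughly the equals/hashCode that @Data would generate for a simple message, written by hand (Request and its fields are illustrative):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Roughly what @Data generates: value equality over all fields, so the
// model checker can recognize two identical messages/states as the same.
class Request {
    final String key;
    final int seqNum;
    Request(String key, int seqNum) { this.key = key; this.seqNum = seqNum; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Request)) return false;
        Request r = (Request) o;
        return seqNum == r.seqNum && Objects.equals(key, r.key);
    }
    @Override public int hashCode() { return Objects.hash(key, seqNum); }
}
```

Without these overrides, Java falls back to reference equality, every copy of a message looks like a brand-new state, and the search space blows up.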

17 of 32

run-tests.py

  • ./run-tests.py --lab 0 --debug 1 1 "Hello World,Goodbye World"
  • Options
    • --debug <# servers> <# clients> <comma-separated list of client arguments>: start the visual debugger with the given arguments
    • --test-num, --lab
    • --no-run, --no-search: execute only the runtime tests or the graph search tests
    • -g FINEST: logs every message
  • Use Python 3, Java 17 :)

18 of 32

Submitting Solutions

Submit design document separately in gradescope (one per group)

For later labs, design doc due week before lab, but ok to update

Do make submit.tar.gz and submit the tar file to Gradescope (one per group)

Keep an eye out for the partner form next week!

19 of 32

To run the visualizer remotely on attu or CSE lab computer

Linux machines like attu have an X Window (X11) client installed; you need an X11 server on your own machine to display what the client sends. (contributor: Michael W. 🦊):

Windows: https://www.howtogeek.com/261575/how-to-run-graphical-linux-desktop-applications-from-windows-10s-bash-shell/

MacOS: https://www.macworld.com/article/1134207/leopardx11.html

Linux: https://www.addictivetips.com/ubuntu-linux-tips/set-up-x11-forwarding-on-linux/

Once the X11 server is set up and you are running the search tests with the visualizer on the lab computer, you can ssh into the lab computer and run

$ google-chrome &

to open a browser; you should be able to connect to localhost:3000 or something similar.

PuTTy settings if you choose to use PuTTy

It’s kinda slow though. Usually better to run it on your own computer or in person in the lab

20 of 32

Type of Tests

Regular (Run)

  • Runs the Nodes, usually in parallel
    • You can use `./run-tests.py --single-threaded` to help debug
    • If it doesn’t work in single threaded, it probably won’t work when running in parallel
  • “UNRELIABLE” tests means that messages could get dropped

Search

  • Checks correctness and liveness
  • State search to look for state violations and/or a goal state (BFS, DFS)
  • Do not use logging. The log messages won’t make sense because of how states are explored, i.e. the search may backtrack, so logs appear to go back on themselves.

21 of 32

Common run commands

Running a single test

  • ./run-tests.py -l LAB_NUM -n TEST_NUM

Running multiple tests

  • ./run-tests.py -l LAB_NUM -n TEST_NUM_1,TEST_NUM_2,TEST_NUM_3

22 of 32

Common run commands

Running with Logging [prints out message receives/timer handles]

  • ./run-tests.py -l LAB_NUM -n TEST_NUM -g FINER

Running with Logging [prints out message sends & receives/timer sets and handles]

  • ./run-tests.py -l LAB_NUM -n TEST_NUM -g FINEST

Running with Logging and writing to a file (Don’t do this for search tests)

  • ./run-tests.py -l LAB_NUM -n TEST_NUM -g FINEST &> output.txt

23 of 32

Common run commands -- [SEARCH] tests

Open up the visualizer on a non-search test with a custom workload

  • ./run-tests.py -l LAB_NUM --debug NUM_SERVERS NUM_CLIENTS WORKLOAD
    • WORKLOAD will look like "PUT:foo:bar,APPEND:foo:baz,GET:foo"

Open up the visualizer on a failed search test [ONLY STARTS VISUALIZER FOR FAILED SEARCH] (You’ll want to click “Debug system with trace”)

  • ./run-tests.py -l LAB_NUM -n SEARCH_TEST_NUM --start-viz

Equals and Hashcode/Idempotency checks (really stupid reasons for failing search tests)

  • ./run-tests.py -l LAB_NUM -n SEARCH_TEST_NUM --checks

Note: --checks doesn’t work on run tests.

24 of 32

Saving SEARCH test traces

./run-tests.py -l LAB_NUM -n SEARCH_TEST_NUM --save-trace

  • Saves a copy of the trace for the search test if there’s an invariant violation
  • Helps with debugging some null pointer exceptions

./run-tests.py --check-trace

  • Runs the trace again to see if you get the same error
    • (You might need to manage your traces)
  • Works with --start-viz

25 of 32

Tour of the Visualization Tool

./run-tests.py --lab 0 --debug 1 1 "hi,bye"

26 of 32

RPC - Remote Procedure Call

When a computer program causes a procedure to execute in a different address space (perhaps on another computer)

27 of 32

RPC Semantics

  • How many times will an RPC be executed by the server before it returns to the invoker of the RPC?
  • At least once
    • Retry
  • At most once
    • UID (sequence numbers like 1,2,3,4)
      • [Don’t use UUID.randomUUID]
    • Idempotent
  • Exactly once
    • At most once with retries

28 of 32

Idempotency and Determinism

Idempotent: the actions taken for a unique message (e.g. changing state or setting timers) are applied once and are not applied again for duplicate messages

  • Problems usually come up when people create new timers for a duplicate message, which expands the state space and makes exploration towards a goal harder

Determinism: Outcome is only a function of state and message/timer handlers

  • This means no timestamps, no random numbers, no UUIDs

29 of 32

Pulling Updates

Updates will be distributed via git (do it now plz).

Do this once at the start of the quarter:

git remote add upstream git@gitlab.cs.washington.edu:cse452-23wi/dslabs/dslabs-handout.git

Do this to pull:

git fetch upstream
git merge upstream/main

30 of 32

Lab 1 help/hints

  • Do not store StringBuilder for the key or value in the KVStore map
  • Make sure that alreadyExecuted uses the right comparison operators
  • “Do not use any static, mutable fields in your classes (constants are fine). Your nodes (i.e., instances of the Node classes) should communicate only through message passing”
    • E.g. the Map in KVStore
  • In Part 3, be sure to integrate the AMO package into SimpleClient and SimpleServer. In SimpleServer, make sure you are constructing an AMOApplication
  • Do not use CompletableFuture
  • While the labs recommend IntelliJ, it isn’t necessary to use it. Feel free to use your favorite IDE/text editor. (vim)
    • There might be some weird saving issues with IntelliJ where it says it saved a change, but the version on disk doesn’t actually reflect that change, so watch out for that. Try checking with git status/git diff.
  • If you have import errors on your IDE, add paths to the JARs provided

31 of 32

Lab 1 more help/hints

  • Don’t use a Hashtable (see framework docs for Node)
  • Read the framework docs under docs
  • You can follow the structure for Lab 0
  • If you see effects from previous tests, make sure that you don’t have static variables.

32 of 32

General Lab Debugging

Invariant Violations

  • Run: -g FINER &> log.txt → might need to print out multiple runs if the issue does not appear every time
  • Search: Visual Debugger → retrace the steps that led to the invariant violation

Liveness Violation or Timeout

  • Run: -g FINER &> log.txt → look for patterns in the log file
  • Search: Visual Debugger → try to drive the system towards the goal (under the constraints of the test)