CSE 486/586 Distributed Systems Programming Assignment 4
Replicated Key-Value Storage
Update
- All replicas should store the same value for each key. This is “per-key” consistency. There is no consistency guarantee you need to provide across keys. More formally, you need to implement per-key linearizability.
- If you implement versioning and quorums correctly, it will give you the per-key linearizability.
- Using versioning and quorum, you can make sure that all replicas store the same version. In the testing scenario for this assignment, there should not be any quorum failure and all replicas should store the same version all the time. By doing quorums and checking versions, you can always check that your implementation is correct.
Introduction
At this point, most of you are probably ready to understand and implement a Dynamo-style key-value storage; this assignment is about implementing a simplified version of Dynamo. (And you might argue that it’s not Dynamo any more ;-) There are three main pieces you need to implement: 1) Partitioning, 2) Replication, and 3) Failure handling.
This document assumes that you are already familiar with Dynamo. If you are not, that is your first step. There are many similarities between this assignment and the previous assignment for the most basic functionalities, and you are free to reuse your code from the previous assignment.
References
Before we discuss the requirements of this assignment, here are two references for the Dynamo design:
- Lecture slides on Dynamo: http://www.cse.buffalo.edu/~stevko/courses/cse486/spring13/lectures/29-dynamo.pptx
- Dynamo paper: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
The lecture slides give an overview, but do not discuss Dynamo in detail, so it should be a good reference to get an overall idea. The paper presents the detail, so it should be a good reference for actual implementation.
Note
It is important to remember that this assignment does not require you to implement everything about Dynamo. Mainly, the following are the things you do not need to consider from the Dynamo paper.
- Virtual nodes (mainly Section 4.2, but any discussion about virtual nodes in the paper in general): We do not use virtual nodes when creating partitions. Just like the previous assignment, all nodes are physical.
- Partitioned network (any discussion about it in the paper): We do not deal with network partitions in this assignment (although we do consider individual failures).
- Data versioning (Section 4.4): We do not need to consider the full data versioning described in the paper; but we do need to have a version number associated with each object in order to handle failures. See “Step 2” for more details.
- Hinted handoff (Section 4.6): We do not consider hinted handoff.
- Adding/removing nodes (Section 4.9): We assume that the global membership is static; individual failures may occur, but only temporarily.
We will discuss this more in “Step 2: Writing a Content Provider” below.
Step 0: Importing the project template
Just like the previous assignment, we have a project template you can import to Eclipse.
- Download the project template zip file to a directory.
- Import it to your Eclipse workspace.
- Open Eclipse.
- Go to “File” -> “Import”
- Select “General/Existing Projects into Workspace” (Caution: this is not “Android/Existing Android Code into Workspace”).
- In the next screen (which should be “Import Projects”), do the following:
- Choose “Select archive file:” and select the project template zip file that you downloaded.
- Click “Finish.”
- At this point, the project template should have been imported to your workspace.
- You might get an error saying “Android requires compiler compliance level...” If so, right click on “SimpleDynamo” from the Package Explorer, choose “Android Tools” -> “Fix Project Properties” which will fix the error.
- You might also get an error about android-support-v4.jar. If so, right click on “SimpleDynamo” from the Package Explorer, choose “Properties” -> “Java Build Path” -> “Libraries” and either fix the android-support-v4.jar’s path or replace it with your SDK’s correct android-support-v4.jar. (Courtesy of Justin and Sileem).
- Try running it on an AVD and verify that it’s working.
- Use the project template for implementing all the components for this assignment.
- The template has the package name of “edu.buffalo.cse.cse486586.simpledynamo“. Please do not change this.
- The template also defines a content provider authority and class. Please use it to implement your Dynamo functionalities.
- We will use SHA-1 as our hash function to generate keys just as last time.
Step 1: Writing the Main Activity
Your app should have an activity used for testing. It should have five buttons, three buttons that displays “Put1”, “Put2”, and “Put3”, one button that displays “Get”, and the last button that displays “LDump.” All the buttons are already provided in the template. Here are the requirements for the buttons.
- Put* buttons
- All Put* buttons should operate the same way except that they insert different values with the same set of keys.
- When touched, it should insert 20 <key, value> pairs into your storage by using your content provider’s insert().
- Each operation should be preceded by a 1-second delay.
- Each touch of the button resets the sequence number to 0.
- The format of the <key, value> pairs is the following:
- Key: the sequence number represented as a string starting from 0 (i.e., “0”, “1”, “2”, …, “19”.)
- Value: string for each button concatenated by each sequence number.
- For the button “Put1”, the values should be “Put10”, “Put11”, “Put12”, etc.
- For the button “Put2”, the values should be “Put20”, “Put21”, “Put22”, etc.
- For the button “Put3”, the values should be “Put30”, “Put31”, “Put32”, etc.
- Get
- This button should retrieve twenty keys, “0”, “1”, …, “19” and their corresponding values using your content provider’s query() interface. Again, each query should be preceded by a 1-second delay.
- Each retrieval should display the <key, value> pair retrieved on the screen. The format of the display is “<key, value>” as a string.
- LDump
- When clicked, this button should dump all the <key, value> pairs stored in your local storage. Since you need to implement a distributed key-value storage based on Dynamo, each instance stores <key, value> pairs that belong to one partition as well as replicas from other partitions. This button should display all local <key, value> pairs on the screen.
- The order of <key, value> pairs you display does not matter as long as it shows all locally stored <key, value> pairs.
- Note that each local storage in this assignment should contain replicas as well. Obviously, the LDump button should also display these replicas when clicked.
Step 2: Writing the Content Provider
Along with the main activity, your app should have a content provider. Just like the previous assignment, this content provider should implement all storage functionalities. For example, it should create server and client threads (if this is what you decide to implement), open sockets, and respond to incoming requests. The following are the requirements for your content provider:
- Membership
- Just as the original Dynamo, every node should know every other node. This means that each node knows all other nodes in the system and also knows exactly which partition belongs to which node; any node can forward a request to the correct node without using a ring-based routing.
- Unlike the original Dynamo, you can assume that there are always 3 nodes in the system. There is no need to implement adding/removing nodes from the system.
- However, there can be at most 1 node failure at any given time. We will discuss this in detail later.
- Each content provider instance should have a node id derived from its emulator port. This node id should be obtained by applying the above hash function (i.e., genHash()) to the emulator port. For example, the node id of the content provider instance running on emulator-5554 should be, node_id = genHash(“5554”). This is necessary to find the correct position of each node in the Dynamo ring.
- Request routing
- Unlike Chord, each Dynamo node knows all other nodes in the system and also knows exactly which partition belongs to which node.
- Under no failures, all requests should be directly forwarded to the coordinator, and the coordinator should be in charge of serving read/write operations.
- Quorum replication
- The replication degree N should be 3. This means that given a key, the key’s coordinator as well as the 2 successor nodes in the Dynamo ring should store the key.
- Both the reader quorum size R and the writer quorum size W should be 2.
- The coordinator for a get/put request should always contact other two nodes and get the votes. For each request, the coordinator should return to the requester whether the request was successful or not. If the coordinator fails to obtain the minimum number of votes, it should return an error.
- For write operations, all objects should be versioned in order to distinguish stale copies from the most recent copy.
- For read operations, if the readers in the reader quorum have different versions of the same object, the coordinator should pick the most recent version and return it.
- Failure handling
- Warning: this part is the main difficulty of this assignment. Handling failures should be done very carefully because there can be many corner cases to consider and cover.
- At any point of time, there can be at most one node failure. We will emulate a failure only by force closing an app instance. We will not emulate a failure by killing an entire emulator instance. You can force close an app by going to “Settings” -> “Applications”, then select the app. There’s a “force close” button.
- You can assume that a node fails only when there is no outstanding request. Although this is an unrealistic assumption, it greatly simplifies the complexity of your implementation. Handling a node failure when there’s an outstanding read/write request is significantly harder.
- All failures are temporary; you need to assume that a failed node will recover soon. When a node recovers, it should copy all the object writes it missed during the failure. This can be done by asking the right nodes and copy from them. Again, you can assume that there is no outstanding request during a node recovery.
- Please focus on correctness rather than performance. Once you handle failures correctly, if you still have time, you can improve your performance.
- There is no need to implement a failure detector; just as the original Dynamo, each request should be used to detect a node failure.
- For this purpose, you can use a timeout for a socket read; you can pick a reasonable timeout value, e.g., 100 ms, and if a node does not respond within the timeout, you can consider it a failure.
- Do not rely on socket creation or connect status to determine if a node has failed. Due to the Android emulator networking setup, it is not safe to rely on socket creation or connect status to judge node failures. Please use an explicit method to test whether an app instance is running or not, e.g., using a socket read timeout as described above.
- Just as the original Dynamo, when a coordinator for a request fails and it does not respond to the request, its successor should be contacted next for the request.
- There is no need to implement hinted handoff; during a write request, if there is a failed node, your implementation should simply skip the failed node for the write operation.
- Your content provider’s URI should be “content://edu.buffalo.cse.cse486586.simpledynamo.provider”, which means that any app should be able to access your content provider using that URI. This is already defined in the template, so please don’t change this. Your content provider does not need to match/support any other URI pattern (though for your “LDump” button, you might want to match something like “content://edu.buffalo.cse.cse486586.simpledynamo.provider/*”, but this is not required and up to you).
- We have fixed the ports & sockets.
- Your app should open one server socket that listens on 10000.
- You need to use run_avd.py and set_redir.py to set up the testing environment.
- The grading will use 3 AVDs. The redirection ports are 11108, 11112, and 11116.
- You should just hard-code the above 3 ports and use them to set up connections.
- Please use the code snippet provided in the PA1 description on how to determine your local AVD.
- avd0: “5554”
- avd1: “5556”
- avd2: “5558”
- Any app (not just your app) should be able to access (read and write) your content provider. As with the previous assignment, please do not include any permission to access your content provider. You can test this with DynamoTester.apk.
- Note that your content provider should only store the <key, value> pairs that belong to itself, i.e., its own partition and replicated <key, value> pairs.
Design Document
You need to write a design document of up to 2 pages (12 pt font in .pdf). This should include:
- What components you designed and what they do
- A discussion of what consistency model your implementation achieves under the assumptions of this assignment. The way you handle failures and how your system behaves under failures greatly affects what consistency you can provide. Please carefully write this part and discuss what consistency model your implementation can achieve.
Submission
We use the CSE submit script. You need to use either “submit_cse486” or “submit_cse586”, depending on your registration status. If you haven’t used it, the instructions on how to use it is here: https://wiki.cse.buffalo.edu/services/content/submit-script
One again, you need to submit three separate files described below. You must follow everything below exactly. Otherwise, you will get no point on this assignment.
- Your app’s .apk: The name should be SimpleDynamo.apk.
- Your design document in .pdf: The name should be SimpleDynamo.pdf. Please do not submit a .docx or .txt file.
- Your entire Eclipse project source code tree zipped up in .zip: The name should be SimpleDynamo.zip. To do this, please go to your Eclipse workspace directory, find your app, and zip it. Please do not use any other compression tool other than zip, i.e., no 7-Zip, no RAR, etc. Please make sure to zip the top-level directory that contains all the source, i.e., unzipping should create one “SimpleDynamo” directory that contains all the source files.
Deadline: 4/26/13 (Friday) 2:59pm
This is right before class @ 3pm. The deadline is firm; if your timestamp is 3pm, it is a late submission.
Grading
This assignment is 15% of your final grade. The breakdown for this assignment is:
- 1% if the put, get, and dump buttons work correctly (apart from the underlying storage implementation)
- 3% if insert/query/replication works correctly under no failures
- 3% if insert/query/replication works correctly under one failure
- 3% if node recovery works correctly
- 2% for the correct and clear description of your failure handling
- 2% for the correct and clear description of the consistency model your group tries to achieve and why
- 1% for the overall clarity of the rest of the description