CSE 486/586 Distributed Systems Programming Assignment 3
Simple Key-Value Storage
Introduction
In this assignment, you will design a simple distributed key-value storage based on Chord. Although the design is based on Chord, it is a simplified version of Chord; you do not need to implement finger tables and finger-based routing; you also do not need to handle node leaves/failures.Therefore, there are three things you need to implement: 1) ID space partitioning/re-partitioning, 2) Ring-based routing, and 3) Node joins.
Just like the previous assignment, your app should have an activity and a content provider. However, the main activity should be used for testing only and should not implement any DHT functionality. The content provider should implement all DHT functionalities and support insert and query operations. Thus, if you run multiple instances of your app, all content provider instances should form a Chord ring and serve insert/query requests in a distributed fashion according to the Chord protocol.
References
Before we discuss the requirements of this assignment, here are two references for the Chord design:
- Lecture slides on Chord: http://www.cse.buffalo.edu/~stevko/courses/cse486/spring13/lectures/15-dht.pptx
- Chord paper: http://conferences.sigcomm.org/sigcomm/2001/p12-stoica.pdf
The lecture slides give an overview, but do not discuss Chord in detail, so it should be a good reference to get an overall idea. The paper presents pseudo code for implementing Chord, so it should be a good reference for actual implementation.
Note
It is important to remember that this assignment does not require you to implement everything about Chord. Mainly, there are three things you do not need to consider from the Chord paper.
- Fingers and finger-based routing (i.e., Section 4.3 & any discussion about fingers in Section 4.4)
- Concurrent node joins (i.e., Section 5)
- Node leaves/failures (i.e., Section 5)
We will discuss this more in “Step 2: Writing a Content Provider” below.
Step 0: Importing the project template
Just like the previous assignment, we have a project template you can import to Eclipse.
- Download the project template zip file to a directory.
- Import it to your Eclipse workspace.
- Open Eclipse.
- Go to “File” -> “Import”
- Select “General/Existing Projects into Workspace” (Caution: this is not “Android/Existing Android Code into Workspace”).
- In the next screen (which should be “Import Projects”), do the following:
- Choose “Select archive file:” and select the project template zip file that you downloaded.
- Click “Finish.”
- At this point, the project template should have been imported to your workspace.
- You might get an error saying “Android requires compiler compliance level...” If so, right click on “SimpleDHT” from the Package Explorer, choose “Android Tools” -> “Fix Project Properties” which will fix the error.
- You might also get an error about android-support-v4.jar. If so, right click on “SimpleDHT” from the Package Explorer, choose “Properties” -> “Java Build Path” -> “Libraries” and either fix the android-support-v4.jar’s path or replace it with your SDK’s correct android-support-v4.jar. (Courtesy of Justin and Sileem).
- Try running it on an AVD and verify that it’s working.
- Use the project template for implementing all the components for this assignment.
- The template has the package name of “edu.buffalo.cse.cse486586.simpledht“. Please do not change this.
- The template also defines a content provider authority and class. Please use it to implement your Chord functionalities.
- We will use SHA-1 as our hash function to generate keys. The following code snippet takes a string and generates a SHA-1 hash as a hexadecimal string. Please use it to generate your keys. The template already has the code, so you just need to use it at appropriate places. Given two keys, you can use the standard lexicographical string comparison to determine which one is greater.
import java.security.MessageDigest; import java.security.NoSuchAlgorithmException; import java.util.Formatter;
private String genHash(String input) throws NoSuchAlgorithmException { MessageDigest sha1 = MessageDigest.getInstance("SHA-1"); byte[] sha1Hash = sha1.digest(input.getBytes()); Formatter formatter = new Formatter(); for (byte b : sha1Hash) { formatter.format("%02x", b); } return formatter.toString(); } |
Step 1: Writing the Main Activity
Your app should have an activity used for testing. It should have three buttons, one button that displays “Test”, one button that displays “LDump” and another button that displays “GDump”; these buttons are already provided in the template. As with the previous assignment, “Test” button is already implemented (it’s the same as “PTest” from the last assignment) and you can use it for your testing purposes. Here are the requirements for the other two buttons.
- LDump
- When touched, this button should dump and display all the <key, value> pairs stored in your local partition of the node. Since you need to implement a distributed key-value storage based on Chord, each instance stores <key, value> pairs that belong to one partition in the Chord ring. This button should display all local <key, value> pairs on the screen.
- The order of <key, value> pairs you display does not matter as long as it shows all locally stored <key, value> pairs.
- GDump
- When touched, this button should dump and display all the <key, value> pairs stored in your whole DHT. Thus, LDump button is for local dump, and this button (GDump) is for global dump of the entire <key, value> pairs.
- Your content provider (below) and your activity should not communicate in any way except through the insert/query interface. Again, there should be no out-of-band communication between your content provider and your activity; your activity should strictly use the content provider interface to communicate with it.
Step 2: Writing the Content Provider
Along with the main activity, your app should have a content provider. This content provider should implement all DHT functionalities. For example, it should create server and client threads (if this is what you decide to implement), open sockets, and respond to incoming requests; it should also implement a simplified version of the Chord routing protocol; lastly, it should also handle node joins. The following are the requirements for your content provider:
- We will test your app with any number of instances up to 3 instances.
- The content provider should implement all DHT functionalities. This includes all communication as well as mechanisms to handle insert/query requests and node joins.
- Each content provider instance should have a node id derived from its emulator port. This node id should be obtained by applying the above hash function (i.e., genHash()) to the emulator port. For example, the node id of the content provider instance running on emulator-5554 should be, node_id = genHash(“5554”). This is necessary to find the correct position of each node in the Chord ring.
- At the minimum, your content provider should implement insert() and query(). The interface definition is the same as the previous assignment (see below), which allows a client app to insert arbitrary <”key”, “value”> pairs where both the key and the value are strings. However, please keep in mind that this “key” should be hashed by the above genHash() before getting inserted to your DHT in order to find the correct position in the Chord ring.
- This means that an app that uses your content provider will give arbitrary <key, value> pairs, e.g., <”I want to”, “store this”>; then your content provider should hash the key via genHash(), e.g., genHash(“I want to”), get the correct position in the Chord ring based on the hash value, and store <”I want to”, “store this”> in the appropriate node.
- Your content provider should implement ring-based routing. Following the design of Chord, your content provider should maintain predecessor and successor pointers and forward each request to its successor until the request arrives at the correct node. Once the correct node receives the request, it should process it and return the result (directly or recursively) to the original content provider instance that first received the request.
- Your content provider do not need to maintain finger tables and implement finger-based routing. This is not required.
- As with the previous assignment, we will fix all the port numbers (see below). This means that you can use the port numbers (11108, 11112, & 11116) as your successor and predecessor pointers.
- Your content provider should handle new node joins. For this, you need to have the first emulator instance (i.e., emulator-5554) receive all new node join requests. Your implementation should not choose a random node to do that. Upon completing a new node join request, affected nodes should have updated their predecessor and successor pointers correctly.
- Your content provider do not need to handle concurrent node joins. You can assume that a node join will only happen once the system completely processes the previous join.
- Your content provider do not need to handle insert/query requests while a node is joining. You can assume that insert/query requests will be issued only with a stable system.
- Your content provider do not need to handle node leaves/failures. This is not required.
- We have fixed the ports & sockets.
- Your app should open one server socket that listens on 10000.
- You need to use run_avd.py and set_redir.py to set up the testing environment.
- The grading will use from 1 AVD to 3 AVDs. The redirection ports are 11108, 11112, and 11116.
- You should just hard-code the above 3 ports and use them to set up connections.
- Please use the code snippet provided in the PA1 description on how to determine your local AVD.
- avd0: “5554”
- avd1: “5556”
- avd2: “5558”
- Any app (not just your app) should be able to access (read and write) your content provider. As with the previous assignment, please do not include any permission to access your content provider. You can test this with DHTTester.apk.
- Your content provider’s URI should be: “content://edu.buffalo.cse.cse486586.simpledht.provider”, which means that any app should be able to access your content provider using that URI. Your content provider does not need to match/support any other URI pattern (though for your dump buttons, you might want to match something like “content://edu.buffalo.cse.cse486586.simpledht.provider/*”, but this is not required and up to you).
- Your provider should have two columns.
- The first column should be named as “key” (an all lowercase string without the quotation marks). This column is used to store all keys.
- The second column should be named as “value” (an all lowercase string without the quotation marks). This column is used to store all values.
- All keys and values that your provider stores should use the string data type.
- Since the column names are “key” and “value”, any app should be able to insert a <key, value> pair as in the following example:
ContentValues keyValueToInsert = new ContentValues();
// inserting <”key-to-insert”, “value-to-insert”> keyValueToInsert.put(“key”, “key-to-insert”); keyValueToInsert.put(“value”, “value-to-insert”);
Uri newUri = getContentResolver().insert( providerUri, // assume we already created a Uri object with our provider URI keyValueToInsert ); |
- Similarly, any app should be able to read a <key, value> pair from your provider with query(). Since your provider is a simple <key, value> table, we are not going to follow the Android convention; your provider should be able to answer queries as in the following example:
Cursor resultCursor = getContentResolver().query( providerUri, // assume we already created a Uri object with our provider URI null, // no need to support the projection parameter “key-to-read”, // we provide the key directly as the selection parameter null, // no need to support the selectionArgs parameter null // no need to support the sortOrder parameter ); |
Thus, your query() implementation should read the selection parameter and use it as the key to retrieve from your table. In turn, the Cursor returned by your query() implementation should include only one row with two columns using your provider’s column names, i.e., “key” and “value”. You probably want to use android.database.MatrixCursor instead of implementing your own Cursor.
- Note that your content provider should only store the <key, value> pairs local to its own partition.
Design Document
You need to write a design document of up to 2 pages (12 pt font in .pdf). This should include:
- What components have you written?
- What does each component do? Please include figures if necessary when describing each component.
- How do you update predecessor and successor pointers? Please include an English description as well as pseudo-code.
Submission
We use the CSE submit script. You need to use either “submit_cse486” or “submit_cse586”, depending on your registration status. If you haven’t used it, the instructions on how to use it is here: https://wiki.cse.buffalo.edu/services/content/submit-script
One again, you need to submit three separate files described below. You must follow everything below exactly. Otherwise, you will get no point on this assignment.
- Your app’s .apk: The name should be SimpleDHT.apk.
- Your design document in .pdf: The name should be SimpleDHT.pdf. Please do not submit a .docx or .txt file.
- Your entire Eclipse project source code tree zipped up in .zip: The name should be SimpleDHT.zip. To do this, please go to your Eclipse workspace directory, find your app, and zip it. Please do not use any other compression tool other than zip, i.e., no 7-Zip, no RAR, etc. Please make sure to zip the top-level directory that contains all the source, i.e., unzipping should create one “SimpleDHT” directory that contains all the source files.
Deadline: 3/29/13 (Friday) 2:59pm
This is right before class @ 3pm. The deadline is firm; if your timestamp is 3pm, it is a late submission.
Grading
This assignment is 15% of your final grade. The breakdown for this assignment is:
- 1% if the LDump button works correctly (apart from the underlying insert/query over the DHT)
- 1% if the GDump button works correctly (apart from the underlying insert/query over the DHT).
- 5% if insert and query operations work correctly (with a static/stable membership)
- 3% if node join works correctly
- 3% for the correct and clear description of your predecessor/successor update procedure
- 2% for the overall clarity of the rest of the description