DKVS: Distributed Key-Value Store --- The ring

Introduction

Waiting for the real network functionalities which will come up next week, we continue developing a few more abstractions, namely the ring. This week, you'll be implementing the functionalities required to manage multiple servers, but in a still simple usage scenario in which all servers store all data (W=N>1, R=1).

Since there are several servers, we can no longer do with a single (hard-coded) IP address. The addresses of the various servers will from now be read from a local file named servers.txt (use DKVS_SERVERS_LIST_FILENAME from config.h). This file contains the address and the port of each server, as well as the number of nodes this server offers, one line per server (UNIX line termination); address, port and node count separated by a space:

<IP> <Port> <# of Nodes>\n

(where \n does not have to be written explicitly, but symbolizes a line break).

For example:

127.0.0.1 1234 1
127.0.0.1 1235 1
127.0.0.1 1236 1

I. Revise get_nodes()

(Although short, this description may lead to several dozens lines of code.)

The first thing to do is to revise get_nodes() (in node_list.c) so as now to create (and add to the node list received as argument) as many nodes as described in the DKVS_SERVERS_LIST_FILENAME file. For this you have to read and parse that file, in the format described above. An IP address is a string of up to 15 characters (e.g. "123.122.121.229"), a port is a strictly positive uint16_t and the number of nodes is a strictly positive size_t. Have a look at the atouint{16,64}() functions defined in util.[ch].

Each node will simply be added in order while parsing the DKVS_SERVERS_LIST_FILENAME file. Its ID is simply its count starting from 1 for the first node added for a given line. For example, if the DKVS_SERVERS_LIST_FILENAME file contains:

127.0.0.1 1234 2
127.0.0.1 1235 1
127.0.0.1 1236 3

the node list will be added the following nodes in order:

IP address  port  node ID
127.0.0.1   1234    1
127.0.0.1   1234    2
127.0.0.1   1235    1
127.0.0.1   1236    1
127.0.0.1   1236    2
127.0.0.1   1236    3

You, of course, have to deal with possible error cases. Use ERR_IO whenever you encounter input/output error, ERR_INVALID_CONFIG if you get parsing errors (wrong or missing value), and of course report any other error code if you get some from internal calls. Also remember that, as stated last week, atouint{16,64}() functions accept negative writtings (and silently convert them to positive values). It's your responsability to detect negative writtings (starting with '-').

II. Ring implementation

As explained in the main description file, all clients must determine the positions of servers (= nodes) not from their line in the servers.txt file, but by calculating a SHA-1 value. Similarly, the positioning of a key will be done by calculating its SHA-1. The two SHA-1 values obtained will be directly interpreted as positions (integers) and directly compared.

To calculate the SHA-1 of a server node, compute the SHA-1 of the string "<IP> <Port> <Node-ID>" (with exactly one space between each value), where <Node-ID> ranges from 1 to the number of nodes indicated in the servers.txt file for the corresponding server.

To implement this behavior, you'll need to add several modifications:

  • a port and a SHA field to the nodes (node_t);
  • a way to compare (and then order) nodes according to their SHA;
  • the proper search function for a node in the ring (according to its SHA).

II.1 Nodes have a port and a SHA field

Since the port now becomes relevant (server "127.0.0.1 1234" is not the same as server "127.0.0.1 1235"), you should add a port field (uint16_t) to the struct node (in node.h) and assign it in node_init().

Then, add a field named sha to struct node so as to store the node SHA-value, as it was done in the warmup, except that here we make use of SHA_DIGEST_LENGTH (140 bits) rather than SHA256_DIGEST_LENGTH (256 bits).

Then, set this field in node_init() to the SHA1() of "<IP> <Port> <Node-ID>" as presented above (man SHA1 if necessary; or revisit the warmup; notice we here make use of the SHA1() function and not the SHA256() as it was the case in the warmup).

As seen in the warmup, to calculate SHA-1, you need to compile with the -lcrypto library. This option is already present in the provided Makefile.

II.2 Nodes are ordered according to their SHA field

(This whole subsection II.2 is really only a few lines of code.)

To be able to compare two nodes according to their SHA1, implement the node_cmp_sha() function (in node.c). This is really simple (man memcmp might help here).

Then to be able to order node lists, implement the node_list_sort() function in node_list.c. This is also really simple (man qsort might help here).

Finally, ring_init() has now, of course, to properly order the list of node (once read from servers.txt). For this, use node_list_sort() with node_cmp_sha() to put them in the right order for their use.

II.3 Searching for nodes

The heart of this work is the ring_get_nodes_for_key() function, the purpose of which is to return the list of N nodes (N passed as a parameter) matching a given key (refer to the main description document). But pay attention: all these nodes must be from different servers: skip nodes from already visited servers (same IP and same port; a different port on the same IP is considered as a different server).

III. Testing and debugging

As usual, we provide you with a few tests, and advise you to write your own.

To ease debugging, we here provide you the code of the node_list_print() tool function, which outputs a node list as such:

aa66f3e5a8d9cdc5c0bd49708bc59847e6915634 (127.0.0.1 1234)
ce5da604e4ef60dbe45042bfcb1c445c7db4ec81 (192.168.1.10 1235)
5b0d0f036d577a486c101c150238165f585fcafd (8.8.8.8 1236)
void node_list_print(const node_list_t *list)
{
    if (list == NULL) return;

    for (size_t i = 0; i < list->size; ++i) {
        const node_t * const node = list->nodes + i;

        for (size_t j = 0; j < SHA_DIGEST_LENGTH; ++j) {
            printf("%02x", node->sha[j]);
        }

        printf(" (%s %d)\n", node->addr, node->port);
    }
}

Feel free to adapt it to your own implementation.

You can either insert calls to this function in the tests or, much more practical, directly call it from gdb with call node_list_print(&ring), assuming you want to print ring.

get_nodes()

Since the get_nodes() function uses a file in the working directory of the program, we use Add_Test_With_Fixture() to create a test with two special functions, setup and teardown, which are always called before and after running the test. We supply the nope() (do nothing) and restore_path() which changes the working directory to the initial one. This allows the tests to change the directory to one in the tests/data/, each of which contain a different servers.txt. For example, the file data/servers-valid/servers.txt contains:

127.0.0.1 1234 1
192.168.1.10 1235 2
8.8.8.8 1236 3

which is a valid config and should yield 6 nodes:

aa66f3e5a8d9cdc5c0bd49708bc59847e6915634 (127.0.0.1 1234)
ce5da604e4ef60dbe45042bfcb1c445c7db4ec81 (192.168.1.10 1235)
f53e1ad3c8539aff77cecbdce77829ff1e40170d (192.168.1.10 1235)
5b0d0f036d577a486c101c150238165f585fcafd (8.8.8.8 1236)
b273a74003065f0112a1c30a87416ba3e1a5027b (8.8.8.8 1236)
11c3f81514637c68aef649a1b93577ba8662c699 (8.8.8.8 1236)

ring_get_nodes_for_key()

Here is an example run for the ring_get_nodes_for_key() function with the servers.txt

127.0.0.1 1234 3
127.0.0.1 1235 3
127.0.0.1 1236 3

Searching for the key "coucou" (SHA-1 5ed25af7b1ed23fb00122e13d7f74c4d8262acd8) should yield the following node_list_t (or a subset, depending on the value of N you used):

127.0.0.1 1236 (5f26268754fcf2a51fcfacaaa2aaf4f0d83f6d67)
127.0.0.1 1235 (914f6ade5b49a3a9be257f8a56bbde9a83fa46aa)
127.0.0.1 1234 (a902e3a5aa4f73150f459436b0580cb7ad72b566)