DKVS: Distributed Key-Value Store --- Network layer
Introduction
The purpose of this week is to add real network functionality (no longer fake) to our project. For this, you will:
- create a socket layer for network communications: socket_layer.h and socket_layer.c;
- use that layer to create a dummy UDP client-server, just to see/create a very simple example: udp-test-client.c and udp-test-server.c (to be created);
- use that layer to have an actual client-server DKVS in your project.
For the latter step, this week we only set up the basic client-server functionality, not the whole ring protocol. That part will be done next week.
First project deliverable (graded step)
But before all of the above, the work done so far (weeks 5 to 9) has to be delivered for grading. So don't forget to submit it before the deadline: this Sunday, May 04th, 23:59.
The easiest way to submit is to run make submit from your done/ directory. This simply adds a project01_1 tag to your commit (in the main branch).
Although you can run make submit as many times as you want, we strongly recommend doing so only when you are sure you want to deliver your work.
Tasks
I. Socket layer
Here we focus on the transport layer (UDP), simply using standard Unix sockets in C to provide the basic functions required for network communication in this project. These are provided in socket_layer.[ch]:
- get_socket() to obtain a network socket (internal program representation) for communication, specifying its waiting time ("timeout", in seconds);
- get_server_addr() to obtain the address, in the "internal object" sense (struct sockaddr_in), of a given IP address and port;
- bind_server() to associate a network communication (external representation: IP address, port) with a network socket (internal representation);
- udp_server_init() to initialize a network communication over UDP;
- udp_read() to read the active socket once and store the output in buf;
- udp_send() to send a response message.
Most of these functions are simply interfaces to the sys/socket.h C functions; see socket(2), socket(7), setsockopt(2), bind(2), recv(2), recvfrom(2) and sendto(2). We strongly recommend you have a look at the corresponding man pages. You will also need to look at inet_pton(3), htons(3) and close(2).
Note that with the network protocol we use (UDP), there is no guarantee that messages will be delivered. If the server response has not arrived within the allotted time (which is set using the get_socket() function from socket_layer), consider the request to have failed (and return ERR_NETWORK).
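As an illustration of the timeout mechanism described above, here is a minimal sketch of how a UDP socket with a receive timeout could be opened. The name example_udp_socket() is hypothetical (it is not the project's get_socket(), whose exact prototype is documented in socket_layer.h), and it returns -1 where the project code would presumably report ERR_NETWORK.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper (not the project's get_socket()): opens a UDP socket
 * whose receive calls give up after timeout_s seconds.
 * Returns the socket descriptor, or -1 on error. */
int example_udp_socket(time_t timeout_s)
{
    assert(timeout_s >= 0);   /* a negative timeout makes no sense here */

    int fd = socket(AF_INET, SOCK_DGRAM, 0);   /* SOCK_DGRAM selects UDP */
    if (fd < 0) return -1;

    /* SO_RCVTIMEO makes recv()/recvfrom() fail instead of blocking forever. */
    struct timeval tv = { .tv_sec = timeout_s, .tv_usec = 0 };
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0) {
        close(fd);            /* do not leak the descriptor on failure */
        return -1;
    }
    return fd;
}
```

With such a socket, a recvfrom() call that gets no datagram within the delay returns -1, which the caller can then treat as a failed request.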
To send requests to the server, use the sendto() function (man sendto). To read the responses received back from the server, use the recvfrom() function (man recvfrom).
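The request/response pattern with sendto() and recvfrom() might look like the following sketch. The function example_exchange() and its parameters are made up for illustration; the socket is assumed to have been created beforehand (possibly with a receive timeout), and -1 stands in for the project's error codes.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: sends req to srv, then waits for one datagram.
 * Returns the number of bytes received, or -1 on any failure (including a
 * receive timeout, if one was set on the socket). */
ssize_t example_exchange(int fd, const struct sockaddr_in *srv,
                         const void *req, size_t req_len,
                         void *resp, size_t resp_cap)
{
    assert(srv != NULL && req != NULL && resp != NULL);

    /* sendto() returns the number of bytes actually sent. */
    ssize_t sent = sendto(fd, req, req_len, 0,
                          (const struct sockaddr *)srv, sizeof(*srv));
    if (sent < 0 || (size_t)sent != req_len) return -1;

    /* recvfrom() also reports who answered; this sketch ignores it. */
    struct sockaddr_in from;
    socklen_t from_len = sizeof(from);
    return recvfrom(fd, resp, resp_cap, 0,
                    (struct sockaddr *)&from, &from_len);
}
```

The return value of recvfrom() is the only way to know how many bytes the reply contains, since UDP payloads here carry no terminator.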
A few tips:
- read the documentation of the functions (in socket_layer.h) before implementing them;
- the port number needs to be converted using htons(3) for portability;
- you can safely cast a struct sockaddr_in* to a struct sockaddr*, and vice versa.
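To make the last two tips concrete, here is a small sketch of how an address structure could be filled from an IP string and a port. The name example_fill_addr() and its return convention are assumptions for illustration; the real get_server_addr() prototype is the one given in socket_layer.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical counterpart of get_server_addr(): fills *out from a dotted
 * IPv4 string and a port given in host byte order. Returns 0 on success,
 * -1 if the string is not a valid IPv4 address. */
int example_fill_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    assert(ip != NULL && out != NULL);

    memset(out, 0, sizeof(*out));   /* zero the padding bytes as well */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);    /* host to network byte order */

    /* inet_pton() returns 1 on success and 0 for a malformed address. */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

The resulting struct sockaddr_in can then be passed to bind() or sendto() through a (struct sockaddr *) cast.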
II. First simple test
Test framework
NOTE: this is both an exercise for the network lectures (L12 mainly) and a simple test (and debug) case before adding network functionalities to the main project.
We strongly recommend that both members of the group work on this part, so that both practice the lecture concepts and know what is available and how to use it before applying it to the main project.
Test your socket_layer implementation by creating two simple programs (see detailed usage examples below):
- a client (udp-test-client.c) that:
  - asks for an unsigned int (on stdin, see example below);
  - sends it (over UDP) to CS202_DEFAULT_IP, port CS202_DEFAULT_PORT;
  - waits for a positive acknowledgment;
  - and properly terminates.
- a server (udp-test-server.c) that:
  - waits for connections on CS202_DEFAULT_IP, port CS202_DEFAULT_PORT;
  - converts the received content to an unsigned int;
  - sends a response back to the sender;
  - exits properly in case of read error.
Unless an error occurs, the server never terminates, as it may have to serve several clients/requests.
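One possible way to put the integer on the wire is sketched below. It assumes a fixed 4-byte representation in network byte order (uint32_t rather than a plain unsigned int), which is a design choice of this sketch and not something the handout imposes; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Writes value into buf (at least 4 bytes) in network byte order.
 * Returns the number of bytes to send. */
size_t example_encode_u32(uint32_t value, unsigned char *buf)
{
    assert(buf != NULL);
    uint32_t wire = htonl(value);
    memcpy(buf, &wire, sizeof(wire));
    return sizeof(wire);
}

/* Reads a 4-byte network-order integer. Returns 0 on success, -1 if the
 * datagram does not have exactly the expected size. */
int example_decode_u32(const unsigned char *buf, size_t len, uint32_t *out)
{
    assert(buf != NULL && out != NULL);
    uint32_t wire;
    if (len != sizeof(wire)) return -1;
    memcpy(&wire, buf, sizeof(wire));
    *out = ntohl(wire);
    return 0;
}
```

Using a fixed-size, fixed-endianness encoding keeps client and server compatible even when they run on machines with different byte orders.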
Example
This is just an example. You are completely free to code the client and the server in whatever way suits you best (to understand and to debug), provided that they fulfill the two four-item bullet lists above. In particular, you're free to choose the messages you'd like to be displayed on the terminal.
Server (in one terminal):
./udp-test-server
Server listening on 127.0.0.1:1234
### [AFTER THE CLIENT INTERACTION BELOW]
Received message from 127.0.0.1:47601: 213
Sending message to 127.0.0.1:47601: 214
...
Client (in another terminal):
./udp-test-client
What int value do you want to send? 213
Sending message to 127.0.0.1:1234: 213
Received response: 214
You can launch the client several times, with different values.
Terminate the server with Ctrl-C once done.
Use Wireshark to debug
Use Wireshark to debug your code.
Try many clients at the same time:
for i in $(seq 15); do echo $i | ./udp-test-client > log-$i 2>&1 & done
What happens? (maybe nothing particular, actually)
III. Modifying existing client code
- in client.h, add a socket to the struct client;
- in client.c, update client_init() to open the client socket (see socket_layer.h; return ERR_NETWORK in case of error); and, of course, update client_end() accordingly;
- in node.h, add a struct sockaddr_in named addr_s to struct node for the actual network protocols; for the sake of simplicity and backward compatibility, keep the former IP address and port fields, although they are now useless; there is no need to check the consistency between these former fake fields and the new struct sockaddr_in;
- in node.c, initialize that new field appropriately using get_server_addr();
- in ring.c, adapt the node comparison in ring_get_nodes_for_key() with:
memcmp(&list->nodes[j].addr_s, &ring->nodes[i].addr_s, sizeof(struct sockaddr_in))
Also, in the provided tool function node_list_print() (in node_list.c), change
node->addr, node->port
for
inet_ntoa(node->addr_s.sin_addr), ntohs(node->addr_s.sin_port)
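The two address operations used above (comparing with memcmp() and printing with inet_ntoa()/ntohs()) can be tried in isolation with the following sketch. The helper names are hypothetical and do not exist in the project.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Returns 1 when both addresses are byte-for-byte identical. This is only
 * reliable if both structs were zeroed (e.g. with memset) before being filled. */
int example_same_addr(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, sizeof(struct sockaddr_in)) == 0;
}

/* Writes "ip:port" into out and returns the snprintf() result.
 * inet_ntoa() returns a pointer to a static buffer, so it is consumed at once. */
int example_format_addr(const struct sockaddr_in *a, char *out, size_t cap)
{
    assert(a != NULL && out != NULL);
    return snprintf(out, cap, "%s:%u",
                    inet_ntoa(a->sin_addr), (unsigned)ntohs(a->sin_port));
}
```

Note that the port is stored in network byte order, hence the ntohs() before printing.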
IV. DKVS client-server
DKVS client-server protocol
dkvs-server launches a server on the address specified as its first argument and the port specified as its second argument; for example:
./dkvs-server 127.0.0.1 1236
Although we will not make use of this possibility this week, we could also pass pairs of optional initial key-value associations; for instance:
./dkvs-server 127.0.0.1 1236 key1 value1 key2 value2
meaning that this server will already store the two key-value pairs ("key1", "value1") and ("key2", "value2"), exactly as if the corresponding two put commands had been done.
dkvs-server receives no input from stdin and produces no output to stdout nor stderr (unless in debug mode, using debug_printf() when needed; see error.h; to make it active, compile with make DEBUG=1 or make DEBUG=1 <some target[s]>).
On the other side, the client (dkvs-client.c) is already written. All we have to do (which is a big piece of work) is "simply" to write the network layer (network.c) to replace the former fake network.
DKVS client and server message exchange is made up of strings of potentially different lengths; even two concatenated strings in the case of put, with the null character \0 serving as a separator (the concatenation we're talking about here is thus a byte concatenation at the network protocol level, not a C-string concatenation, which wouldn't make any sense here).
- Put-requests will therefore have the following format: "<key>\0<value>";
- get-requests the format: "<key>";
- and responses to get-requests the format: "<value>";
where <key> and <value> represent the character string (in the common sense) of, respectively, the key and the value, without any final null character (that's what we mean by "common sense character strings", as opposed to C-strings).
For example, a request to write (= put) the value "xy" for the key "abc" will send the six-byte sequence "abc\0xy" to the network, without any final \0 (which would otherwise be a seven-byte sequence).
The response to a read (= get) request for this same key (once written) will send "xy" (2 bytes) and not "xy\0" (3 bytes; as a reminder, the total number of bytes exchanged at network level is known, for example as the return value of recvfrom(2)).
So be very careful when converting these keys/values into C strings not to forget the terminal null characters, when needed.
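The message format described above can be sketched as follows: one helper builds the "<key>\0<value>" byte sequence, another turns received bytes back into a C string. Both function names are hypothetical, and -1 stands in for whatever error code the project would return.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Builds "<key>\0<value>" (no trailing '\0') into out.
 * Returns the message length, or -1 if out is too small. */
ssize_t example_build_put_msg(const char *key, const char *value,
                              char *out, size_t cap)
{
    assert(key != NULL && value != NULL && out != NULL);
    const size_t klen = strlen(key);
    const size_t vlen = strlen(value);
    if (klen + 1 + vlen > cap) return -1;

    memcpy(out, key, klen);
    out[klen] = '\0';                      /* separator, part of the message */
    memcpy(out + klen + 1, value, vlen);   /* value copied without its terminator */
    return (ssize_t)(klen + 1 + vlen);
}

/* Turns n received bytes into a C string by adding the terminator that the
 * network message deliberately omits. Returns 0, or -1 if out is too small. */
int example_to_cstring(const char *bytes, size_t n, char *out, size_t cap)
{
    assert(bytes != NULL && out != NULL);
    if (n + 1 > cap) return -1;
    memcpy(out, bytes, n);
    out[n] = '\0';
    return 0;
}
```

For the key "abc" and the value "xy", the first helper produces the six bytes of the example above; memcpy() is used rather than strcpy() precisely because the message contains an embedded null character.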
As these keys and values can in principle contain anything, they could become very long. In order to limit the impact on network communications (and possible failures at lower protocol levels), we have decided to limit the length of each of these elements (keys and values) to MAX_MSG_ELEM_SIZE (defined in config.h). Thus, if network_get() or network_put() receives arguments that are too long, it must exit with the ERR_INVALID_ARGUMENT error.
NOTES:
- If useful, we have also defined the constant MAX_MSG_SIZE (in config.h) to represent the maximum size of a network message, during a put command ("<key>\0<value>"): 2 * MAX_MSG_ELEM_SIZE + 1.
- To search for a particular character, including the null character, in any sequence of characters, use the memchr() function (man memchr, similar to strchr(), but which doesn't stop at the first \0 encountered).
[end of notes]
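A size check along these lines could look like the sketch below. The constants EXAMPLE_MAX_ELEM and EXAMPLE_MAX_MSG are stand-ins with an arbitrary small value chosen for the illustration; the real MAX_MSG_ELEM_SIZE and MAX_MSG_SIZE are the ones in config.h, and the function name is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for MAX_MSG_ELEM_SIZE and MAX_MSG_SIZE; the real values are in config.h. */
#define EXAMPLE_MAX_ELEM 16
#define EXAMPLE_MAX_MSG  (2 * EXAMPLE_MAX_ELEM + 1)   /* key + '\0' + value */

/* Returns 0 if both key and value fit in one message element, -1 otherwise
 * (where the project would return ERR_INVALID_ARGUMENT). */
int example_check_sizes(const char *key, const char *value)
{
    assert(key != NULL && value != NULL);
    if (strlen(key) > EXAMPLE_MAX_ELEM) return -1;
    if (strlen(value) > EXAMPLE_MAX_ELEM) return -1;
    return 0;
}
```

Checking each element separately guarantees that the assembled put message never exceeds EXAMPLE_MAX_MSG bytes, so a buffer of that size is always sufficient.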
Another aspect to pay attention to concerns non-existent keys. This case is managed explicitly by responding with a null character ('\0') when a request is made to read a non-existing key. Since it's possible to associate empty values (empty string) with keys, it's important to understand the difference between replying with an empty value (empty string) associated with an existing key and replying to a request for a non-existing key:
- the network message sent in response to a get-request for a known key which has an empty value will simply be empty (no characters, 0 bytes; this is consistent with the usual value replies: the byte-size of the network reply message is exactly the length of the value);
- while the message sent in response to a read request for an unknown key will be the empty string in the sense of C, i.e. the sequence of characters reduced to the single null character (\0, and therefore of length 1 byte).
This is the same conceptual difference as between { "", 1 } and { NULL, 0 } for a struct foo { const char* ptr; size_t total_size; };.
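The distinction between an empty value and an unknown key can be expressed in code as in the following sketch. The enum and the function example_decode_get() are hypothetical names introduced for the illustration; n is meant to be the return value of recvfrom().

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

enum example_reply { EXAMPLE_FOUND, EXAMPLE_NOT_FOUND, EXAMPLE_ERROR };

/* Interprets the n bytes returned by recvfrom() for a get-request.
 * On EXAMPLE_FOUND, value_out holds the value as a C string. */
enum example_reply example_decode_get(const char *buf, ssize_t n,
                                      char *value_out, size_t cap)
{
    assert(buf != NULL && value_out != NULL);
    if (n < 0) return EXAMPLE_ERROR;                  /* timeout or read error */
    if (n == 1 && buf[0] == '\0') return EXAMPLE_NOT_FOUND;
    if ((size_t)n + 1 > cap) return EXAMPLE_ERROR;    /* no room for the terminator */

    memcpy(value_out, buf, (size_t)n);                /* n == 0 gives the empty value */
    value_out[n] = '\0';
    return EXAMPLE_FOUND;
}
```

A zero-byte reply therefore yields the empty C string as a valid value, while the one-byte reply '\0' is reported as a missing key.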
DKVS client-ring protocol
Regarding the ring behavior, for this week the client will simply send its request to all the servers, one after the other. Since this week no server will reply (this will be implemented next week), this loop will simply be used in debug mode to test the message exchange and see which server receives (and sends) what. The more advanced ring protocol (as described in the main description file) will only be implemented next week.
network.c
Now it's time to create the communication protocols in network_get() and network_put() in the network.c file. First have a look at this provided file and see how we already decomposed the tasks.
Let's start with the get command (network_get()):
- this command first has to collect all the nodes that store the provided key (recall what you did in week 09); of course, if it fails it should return immediately;
- then, it must contact the servers one after the other (in the order given by their SHA; recall what was done in week 09), but stop as soon as the first read succeeds (which won't be the case this week since the servers won't reply); to "contact" a server, it must first send the key request using the server_get_send() tool function, and then, in case of success, get the server reply using server_get_recv().
To finalize the get command, you thus need to:
- complete server_get_send(): this simply sends the key to the server; it returns ERR_NONE if the sent length equals the key length and ERR_NETWORK otherwise;
- complete server_get_recv(): this will be done next week; for the moment the function does nothing more than what is provided.
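The control flow of the get command (try the servers in order, stop at the first success) can be sketched independently of any networking. Everything below is hypothetical: the callback type stands for "send the key then read the reply from server i", and the probe callback exists only so the loop can be experimented with.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: tries to read from server number i, returns 0 on success. */
typedef int (*example_try_fn)(size_t i, void *ctx);

/* Contacts servers 0..n-1 in order and stops at the first success.
 * Returns 0 if some server answered, -1 if all of them failed. */
int example_get_first(size_t n, example_try_fn try_server, void *ctx)
{
    assert(try_server != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (try_server(i, ctx) == 0) return 0;   /* first success: stop here */
    }
    return -1;
}

/* Demo state and callback: servers before first_ok fail, and calls are counted. */
struct example_probe { size_t first_ok; size_t calls; };

int example_probe_try(size_t i, void *ctx)
{
    struct example_probe *p = ctx;
    p->calls++;
    return i >= p->first_ok ? 0 : -1;
}
```

With three servers of which the second is the first to answer, the loop makes exactly two attempts; this week, since no server replies, every attempt fails and the loop visits all of them.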
Regarding the put command (network_put()):
- it should first collect all the nodes that store the provided key;
- then, it must contact each server (server_put_send(), see the description below) one after the other, waiting each time for a (write) acknowledgement or a timeout before moving on to the next; if any of the servers fails to write, it must fail BUT it must nevertheless try to write to each of the servers, i.e. continue to (try to) write to the other servers before finally reporting its failure.
To finalize the put command, you thus need to complete server_put_send(): this should create the client message as explained above (the null-separated byte concatenation of key and value) and send it; it returns ERR_NONE if the sent length is the appropriate length, ERR_NETWORK otherwise.
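The "fail, but still try every server" rule of the put command is easy to get wrong with an early return, so here is a minimal sketch of the loop on its own. The callback type, the logging struct and all names are hypothetical; the callback stands for "send the put message to server i and wait for its acknowledgement".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: writes to server number i, returns 0 on success. */
typedef int (*example_put_fn)(size_t i, void *ctx);

/* Writes to every server, even after a failure, and reports failure if at
 * least one write failed. */
int example_put_all(size_t n, example_put_fn put_server, void *ctx)
{
    assert(put_server != NULL);
    int result = 0;
    for (size_t i = 0; i < n; ++i) {
        if (put_server(i, ctx) != 0) result = -1;   /* remember, but keep going */
    }
    return result;
}

/* Demo state and callback: only server number `failing` fails; calls are counted. */
struct example_log { size_t failing; size_t calls; };

int example_log_put(size_t i, void *ctx)
{
    struct example_log *l = ctx;
    l->calls++;
    return i == l->failing ? -1 : 0;
}
```

Even when the very first server fails, the remaining servers are still contacted, and the overall result is a failure.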
dkvs-server.c
The last thing to be done is to complete dkvs-server. As usual, first have a look at the provided file. There you'll have to:
- assign the port from argv (second argument); as usual, properly handle the error cases (this will not be repeated);
- launch the UDP server;
- in the server loop:
  - read the request message;
  - optionally print debugging messages in debug mode;
  - if the request does not contain any '\0', do a server_get(); memchr() might help here;
  - otherwise, if the request is the empty string, return ERR_NOT_FOUND for this week (we might change that next week);
  - otherwise do a server_put() with the appropriate parameters;
- and, of course, do all the appropriate closing/garbage collecting wherever needed.
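The three-way decision in the server loop can be sketched as a pure classification function. This is only an illustration: the names are hypothetical, and "the request is the empty string" is interpreted here as the one-byte message '\0' (the C-sense empty string defined earlier), which is an assumption of this sketch.

```c
#include <assert.h>
#include <string.h>

enum example_kind { EXAMPLE_GET, EXAMPLE_EMPTY, EXAMPLE_PUT };

/* Classifies a request of len bytes. For EXAMPLE_PUT, *value_off receives the
 * offset of the value inside msg (the key is the C string starting at msg). */
enum example_kind example_classify(const char *msg, size_t len, size_t *value_off)
{
    assert(msg != NULL && value_off != NULL);
    const char *sep = memchr(msg, '\0', len);   /* bounded by len, unlike strchr() */
    if (sep == NULL) return EXAMPLE_GET;         /* no separator: "<key>" */
    if (len == 1) return EXAMPLE_EMPTY;          /* the single byte '\0' */
    *value_off = (size_t)(sep - msg) + 1;        /* value has len - *value_off bytes */
    return EXAMPLE_PUT;
}
```

In the get case the received key has no terminator, so the receive buffer needs one spare byte for a '\0' before the key can be used as a C string; the same applies to the value in the put case.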
V. Tests
It will of course be important to test your code thoroughly, step by step, in terms of the new network protocols.
We recommend that you use Wireshark to debug your code. You can also make use of debug_printf() (see error.h).
To make it active, compile with make DEBUG=1 (or make DEBUG=1 <some target[s]>).
One server
Have only 1 server in servers.txt, for instance:
127.0.0.1 1236 1
and launch that server in one terminal:
./dkvs-server 127.0.0.1 1236
Then, in another terminal, first try to get the value for some key, e.g.:
./dkvs-client get -- somekey
You should receive:
server_get_send(): asking for key "somekey" to 127.0.0.1:1236
server_get_recv(): read "" (size: -1)
FAIL
ERROR: Network error
[...]
And on the server side, you should see:
Server listening on 127.0.0.1:1236
Received: "somekey" (size: 7)
server get for key "somekey"
Then you can try to add a new (key, value) pair, for instance:
./dkvs-client put -- somekey somevalue
You should receive:
server_put_send(): sending "somekey" --> "somevalue" to 127.0.0.1:1236
network_put(): got reply: -1
FAIL
ERROR: Network error
[...]
And on the server side, you should see:
Received: "somekey", "somevalue" (size: 17)
server put for "somekey" --> "somevalue":
Several servers
Now try with a few more servers, for instance with servers.txt containing:
127.0.0.1 1234 1
127.0.0.1 1235 1
127.0.0.1 1236 1
which should lead to the same output as above.
NOTICE HOWEVER that the situation is a bit different here: the ring now has 3 servers, and the first one listed (port 1234) is NOT the one that handles the request (port 1236). This is because of the SHA positioning in the ring: the first server to actually be contacted is indeed on port 1236 and not 1234.
You can also try:
./dkvs-client put -- somekey4 somevalue2
which should send to 127.0.0.1:1235 (rather than to 127.0.0.1:1236), and
./dkvs-client put -- somekey12 somevalue3
which should send to 127.0.0.1:1234.