DKVS: Distributed Key-Value Store --- local server (S=N=R=W=1)
Get updates
To get the new material on your project repository, you have to:
-
check that you don't already have the upstream reference repository:
git remote -v
if you see only two lines or if you don't see
projprogsys-epfl/cs202-25-project.git
in any line, you don't have it and must add it (next step); -
only once: add the upstream reference repository (I though this was automatically donc by Github Classroom, but it seems not):
git remote add upstream git@github.com:projprogsys-epfl/cs202-25-project.git
check it was properly added by doing again
git remote -v
From now on, every time you want to fetch updates from us, first be up-to-date with your own code (i.e. you pushed all your commits) and then do (only one of you has to do that, the other will simply git pull
as usual):
git fetch upstream
git merge upstream/main
git push
Introduction
The aim of this week is to setup the client-side command-line tool ("program", in other words) to interact with the DKVS infrastructure (review the general presentation document if necessary). Even though this week infrastructure is quite simple (a single server, N=W=R=1), it's important for your efficiency to organize yourself well, module by module with the broad view in perspective and to systematically test your implementations.
So this week's goal is to achieve a local infrastructure only, on a single server, without redundancy, using the hash tables implemented last week. Since networking will not be implemented before week 10 (so that you have seen the proper concepts in class), we will, for now, fake the network interactions. Instead of actually sending network requests to the nodes, the operations will use a local hashtable to simulate network interactions.
We here start by describing the higher-level tools before going on to describe the lower-level implementation modules. As this week's work is quite substantial, it's important to divide it up between the two of you; for example, one of you can deal with the argument parsing (args.[ch]
, parse_opt_args()
and client.c
), while the other deals with "network" (fake) modeling and interactions (node
, node-list
, ring
and dkvs-client get
and dkvs-client put
). But this is only one possible example of organization among others.
I. General description of the command line tools
The command-line tool we start implementing this week is named dkvs-client
. The general syntax of dkvs-client
is:
dkvs-client <command> [-r R] [-w W] [-n N] -- ...
There is only one mandatory argument, <command>
. We will for now focus only on two commands: put
and get
, detailed below. The three optional flags -r
, -w
and -n
take a strictly positive integer as parameter (more explainations later). Then a list of specific arguments can be provided, to be placed after --
. Some simple examples are provided below.
The put
command inserts a key-value pair in the store. For example, to add the following two key-value associations
a -> "hi a"
b -> "hello b"
we will do:
> ./dkvs-client put -- "a" "hi a"
OK
> ./dkvs-client put -- "b" "hello b"
OK
If some key already existed in the table, the put
command simply overwrites new value. For example, if the above table was containing
a -> "hello a"
c -> "hi c"
before the above two put
commands, then it contains:
a -> "hi a" b -> "hello b" c -> "hi c"
after the above two put
commands.
The get
command fetches key value in the store. For instance, with the former table:
> ./dkvs-client get -- "a"
OK hi a
> ./dkvs-client get -- "b"
OK hello b
> ./dkvs-client get -- "d"
FAIL
> ./dkvs-client get -- "some key"
FAIL
II. Low-level modules
We have modularized the infrastructure for this project as follows (review the general presentation document if necessary):
args.[ch]
: managing command line arguments;client.[ch]
: providing initialization and termination functions for a client;- networking:
network.h
: interface for client-server communication;fake-network.c
(provided): a fake implementation of client-server communication functions forput
(write key-value) andget
(read value) requests; you can adapt it for your own tests, see the dedicated section below (II.6 Fake network);
- DKVS:
ring.[ch]
: the key ring abstraction; in this project, it'll be implemented as a dynamic array of nodes;node_list.[ch]
: a dynamic array of nodes;node.[ch]
: a key ring node, i.e. one storage place for key-values pairs.
II.1. Arguments (args.[ch]
)
This module handles the parsing and storage of command line arguments. It exposes a struct args
, and its typedef args_t
, as well as the function parse_opt_args()
. See args.h
.
If you didn't yet, we recommand you have a look at the two complement lectures (in French):
In a file args.c
(to be created), you have to implement the function parse_opt_args()
(see prototype in args.h
), which initializes the args_t
according to optional arguments, removes them and the "--"
from the argument list and leaves all the other following arguments (in the argument list). Of course, argc
shall be updated consistently.
supported_args
is an information value to indicate what type of argument we want to have. It's making use of arg_kind
bit flags. For instance, if we want to have N and W for a specific command, we would call
parse_opt_args(&args, TOTAL_SERVERS | PUT_NEEDED, &argc, &argv);
Note: even though the -r
, -w
and -n
flags are optional, they must be initialized using default values if not present (corresponding default values are provided in args.h
).
The following constraints must be enforced:
- arguments not present in the
supported_args
must be ignored - if no value is given for
N
, then it should take the maximum value between the arguments given forW
andR
, or theDKVS_DEFAULT_N
if no argument is given - if not given as arguments,
W
andR
should be set to the minimum ofN
and their respective default value R <= N
andW <= N
.
An error code shall be returned to indicate whether an error occurred, either:
ERR_NONE
: no errors;ERR_INVALID_ARGUMENT
: if one of the pointer given toparse_opt_args()
isNULL
; this check should be present in all your function and we will not repeat it afterwards;ERR_INVALID_COMMAND
: if the command line arguments do not match the above syntax.
II.2. Node (node.[ch]
)
The node
layer is quite simple and only serve to better structure the code in layers. It represent the ring node abstraction which, for the moment and up to week 11, will simply be one server.
For this week, simply define the struct node
to contain one adress (an immutable string).
In node_init()
simply assign the address. port
and node_id
are unused for the moment.
node_end()
contains what should be done when a node comes to an end. Certainly nothing for the moment.
There is no need to do the two other functions (node_cmp_...
) for the moment.
II.3. Node list (node_list.[ch]
)
A node list will be usefull to represent the ring.
Define the type struct node_list
as a dynamic array of nodes.
Then define node_list_add()
and node_list_free()
(usual functions for dynamic arrays).
The only unusual function w.r.t. dynamic arrays is the initialization: here the initialization function is named get_nodes()
and will be a bit more complex once we have the real network (getting all the nodes from the ring). For the moment, simply create a node
, with DKVS_DEFAULT_IP
(its address), DKVS_DEFAULT_PORT
(unused) and 0 id (unused); and add it to the list.
No need to define node_list_sort()
for the moment.
II.4. Ring (ring.[ch]
)
In this project, a ring is simplified to be simply reduced to the list of nodes (i.e. a dynamic array of nodes, node_list
).
Simply define ring_init()
and ring_free()
appropriately.
II.5. Client (client.[ch]
)
The "client" module is not very developed. It contains only two functions: initialization and end-of-use of a client.
You must first define the client_t
type in client.h
, which for the moment only contains the arguments parsed from the command line (args_t
) and a ring.
Regarding client.c
:
- for initialization, it needs to parse the command line arguments received as parameter and initialize its ring;
- for the end of a client, you only need to free the resources used by the client (for the moment: its ring).
Those functions also return error codes indicating whether they succeeded. If they call a function which returns an error, this error should be returned.
Note: This behavior should be present in all your code and will not be repeated.
II.6. Fake network
The network layer is the highest tool layer, modeling client-server communication. It offers two functions for managing these communications: network_put()
for request exchange during key-value writing, and network_get()
for value reading. See network.h
for the interface (which will remain the same for real network interactions).
For now, we provide you with a fake implementation of this layer in fake_network.c
, fake1.h
and fake2.h
. It uses a local hashtable, therefore changes are not persisted across multiple calls.
Each fake%.h
file contains a macro definition for MAKE_FAKE_NETWORK
, which is used to initialize the key-value store before running the command. When compiling the project, for all fake%.h
; it will generate a corresponding fake-dkvs-client-%
, which runs the command on its key-value store.
Notice that to make the fake-network easily testable, you have to implement the Htable_print()
function in hashtable.c
. It prints all the content of a table, using the following format (the order does not matter):
key1: value 1
key2: value 2
...
for instance:
a: hello a
bb: hello bb
ab: hello ab
bbb: hello bbb
where "a
", "bb
", "ab
" and "bbb
" are keys, and "hello xxx
" their corresponding values (see fake2.h
).
Its pseudo code is:
void Htable_print(const Htable_t* table)
{
if (table == NULL) return;
if (table size == 0) { // adapt to your table size
puts("<Empty HashTable>");
return;
}
for (all bucket in table) {
bucket_t* bucket = ... //adapt to your code
for (all key value pairs in bucket) { //adapt to your code
printf("%s: %s\n", key, value);
}
}
}
III. Implementing command-line tools
III.1. dkvs-client
commands
cli_client_get()
needs to be implemented in dkvs-client-get.c
, and cli_client_put()
in dkvs-client-put.c
. Their prototype and documentation are available in dkvs-client-cmds.h
. They receive as parameter a client_t
, and their arguments via argc
and argv
:
remember (from argcs.c
and parse_opt_args()
descriptions above) that the argument(s) they need (provided on the command line after --
) remain in argv
.
Use the appropriate functions from network.h
to communicate with the DKVS ring.
In case of invalid arguments, these functions shall return, as usual, the ERR_INVALID_ARGUMENT
error code.
Neither a key nor a value shall not be longer than MAX_MSG_ELEM_SIZE
characters.
cli_client_get()
will print "OK <value>
" if it got a value from the network and "FAIL
" if not (see examples above). It will return the error code it code from the network call.
cli_client_put()
behave the same way, except it simply prints "OK
" (no value) in case of successful put.
III.2. main()
The role of the main()
function, to be put in dkvs-client.c
, is simply to receive the command line arguments and call the corresponding function from dkvs-client-cmds.h
(cli_client_get()
if the argument is "get"
and cli_client_put()
if it's "put"
). Of course, everything needed shall be properly initialized (use TOTAL_SERVERS | GET_NEEDED
flags for get
and TOTAL_SERVERS | PUT_NEEDED
flags for put
; and remember that get
requires one argument (the key) whereas put
requires two).
Have a look a the provided code and complete the commands
array with the appropriate values so as to have the get
and put
command properly handled.
III.3. fake-dkvs-client
executable
To make use of the provided fake_network
(or any other of your own), dkvs-client
has to be linked with it.
If you didn't update your Makefile
already, please do by importing it from provided
. As said above (II.6 Fake network), you will have a separate fake-dkvs-client-%
for each fake%.h
. Since we provide you with fake1.h
and fake2.h
, you should have two fake executables. If you take a look at fake1.h
, you'll see that it contains an empty key-value store; while fake2.h
has a few entries.
You can test this with:
$ ./fake-dkvs-client-1 get -- a
...
==== end of fake network init =================
FAIL
ERROR: Not found
...
$ ./fake-dkvs-client-2 get -- a
...
==== end of fake network init =================
OK hello a
Feel free to add as many fake%.h
as you want for your own tests; the Makefile
will automatically detect them and build the corresponding executables.
IV. Testing and debugging
Unit tests
Similarly to last week, we provide you with a few unit tests in provided/tests/unit
. We strongly recommend you to add your own tests after copying them into your done/
directory.
As a reminder, you can run those tests by using make
from the done/
directory:
make unit-tests
: run all the unit tests, stopping on the first errormake all-unit-tests
: run all the unit tests, ignoring failuresmake test-%
: run the unit tests%
, contained in the file(provided|done)/tests/unit/unit-test-%.c
If a unit test fails, it will print a make dbg ...
command which, when ran from the done/
directory, will start gdb
in the failing test.
End-to-end tests
Since you now have fully working executables, we'll start using end-to-end tests; i.e. tests that check the functionality from a user perspective. You can find them in provided/tests/end-to-end
.
Their workflow is rather simple: they start the executable with some parameters and check that its output is correct. The tests are written in Python using the native unittests
library; we also provide a class DKVSTests
with a few utilies to run a (potentially fake) executable and check its prints and return code.
Again, from the done/
directory, you can use the following make
commands:
make end2end-tests
: run all the end-to-end tests, stopping on the first errormake all-end2end-tests
: run all the end-to-end tests, ignoring failures
Upon failure, the test will print you a list of commands that you can use to reproduce the issue.
All together
On top of the make
targets detailed above, you can use these to test everything at once:
make check
: run all the tests, stopping on the first errormake all-tests
: run all the tests, ignoring failures
Once everything is working, we also provide a make feedback
(make feedback-VM-CO
if you're working on EPFL VMs) which gives partial feedback on your work. This is normally used for a minimal final check of your work. It's much better to run local tests directly on your machine beforehand (including more tests you've added yourself, if necessary) as described above.
The Docker image used by make feedback
will be tagged latest
every week. If you want to run feedback for a specific week, change (in the Makefile
at the line that defines IMAGE
) this latest
tag to weekNN
where NN
is the desired week number, e.g.:
IMAGE=chappeli/cs202-feedback:week06
V. Advice
WARNING: there are many possible sources of error in all these functions. You must systematically check all arguments, as well as the return values of tool functions used in a higher-level function. It's worth imagining that each function can be tested separately with "bizarre" inputs (which we'll do). If you have any questions, don't hesitate to ask during the exercise sessions.
We strongly advise you to work regularly and systematically to make weekly tags (if your corresponding work is operational) in order to benefit from our test reports.