ImgFS: Image-oriented File System --- Network layers: socket layer
Introduction
This week we start a new aspect of the project: adding HTTP access (server and client) to our Image Database. Basically, we want to convert our imgfscmd
application to a client-server application that uses HTTP (over TCP as its transport-layer protocol).
This work will be structured as follows over the next three weeks:
- this week: create a socket layer for network communications, and use that layer to create a simple HTTP server (to be made more complex next week);
- next week: create a (simplified) HTTP layer over the socket layer that contains all the functionalities needed for this project (mainly: parsing the HTTP requests designed for this project), but in a blocking way (handling only one connection at a time);
- and in the last week: create a server that can serve (!) our image database commands (read, insert, ...) through HTTP access, use it via an HTTP client, and make it non-blocking (multiple connections via a multi-threaded program).
We thus have three logical layers, each of which shall be tested on its own:
- the socket layer, to be tested with tcp-test-client.c and tcp-test-server.c (to be done);
- the "generic" (but incomplete) HTTP layer, to be tested with http-test-server.c (provided) and curl;
- the ImgFS-over-HTTP layer, to be tested with imgfs_server and either curl (early tests) or a browser, using index.html (already provided).
For this week, we focus on the transport layer (TCP), simply using standard Unix sockets in C to provide the four following functions (see socket_layer.h):
- tcp_server_init(), to initialize a network communication over TCP;
- tcp_accept(), to create a blocking call that accepts a new TCP connection;
- tcp_read(), to create a blocking call that reads the active socket once and stores the output in buf;
- tcp_send(), to send a response message.
Most of these functions are simply interfaces to the sys/socket.h C functions socket(2), bind(2), listen(2), accept(2), recv(2) and send(2). We strongly recommend you have a look at the corresponding man-pages.
We then use that layer to create a simple HTTP-server API. There, you'll have to implement two functions:
- http_receive(), to accept a connection and read from it;
- http_reply(), to send a response message.
http_init(), to initialize an HTTP communication, and http_close(), to close it, are provided.
The fifth function that appears in http_net.h, http_serve_file(), will be implemented later.
Provided material
In the provided/src directory, you can find the following files (some of which have certainly already been copied to your done/):
- socket_layer.h: prototypes of the tcp_*() functions, which interact with UNIX sockets and serve as the basis for our HTTP web server;
- http_net.h: prototypes of the HTTP layer, responsible for receiving incoming requests and generating HTTP responses;
- http_prot.h: parsing of HTTP requests;
- imgfs_server_service.h: core functions of the imgfs HTTP server: sets up and shuts down the server, dispatches requests;
- http_net.c: implementation of the HTTP layer;
- imgfs_server.c: the main code of our server;
- imgfs_server_service.c: the implementation of the core functions to offer HTTP services to our ImgFS database;
- http-test-server.c: a simple test of the HTTP layer.
Tasks
Socket layer
tcp_server_init()
In a file socket_layer.c (to be created), define the tcp_server_init() function (see its prototype in socket_layer.h), which:
- creates a TCP socket (see the socket(2) man-page; use AF_INET and SOCK_STREAM);
- creates the proper server address (struct sockaddr_in type); notice that, for portability, the port number received as argument shall be converted using htons() (see the htons(3) man-page);
- binds the socket to the address (see bind(2)); note: there is no problem passing a pointer to a struct sockaddr_in as a pointer to a struct sockaddr;
- and starts listening for incoming connections (see listen(2)).
The function returns the socket id.
Whenever an error is encountered, this function prints an informative message on stderr (see perror(3)), closes whatever has been opened so far, and returns ERR_IO. Sockets must be closed using close(2).
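The four steps above could be sketched as follows. This is only an illustration under assumptions, not the reference solution: the signature (a uint16_t port, an int return value) is a guess at what socket_layer.h declares, the ERR_IO value is a placeholder for the one defined by the project, and both INADDR_ANY and the SOMAXCONN backlog are choices made for this sketch.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define ERR_IO (-1) /* placeholder: the real value comes from the project's headers */

/* Assumed signature; check socket_layer.h for the actual prototype. */
int tcp_server_init(uint16_t port)
{
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock == -1) {
        perror("socket");
        return ERR_IO;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);              /* host to network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_ANY); /* assumption: listen on all interfaces */

    if (bind(sock, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        perror("bind");
        close(sock); /* do not leak the socket on error */
        return ERR_IO;
    }

    if (listen(sock, SOMAXCONN) == -1) { /* assumption: system maximum backlog */
        perror("listen");
        close(sock);
        return ERR_IO;
    }

    return sock;
}
```

Note that every error path after socket() has succeeded closes the socket before returning.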
tcp_accept()
The tcp_accept() function (to be defined also in socket_layer.c) is simply a (one line of code) frontend to the accept(2) function.
We don't make any use of the addr and addr_len arguments of accept() (use NULL).
This function returns the return value of accept().
tcp_read() and tcp_send()
Similarly, tcp_read() and tcp_send() are also frontends to the recv(2) and send(2) functions, respectively. They return either ERR_INVALID_ARGUMENT if they received an improper argument, or the return value of the system function called.
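As an illustration, the three frontends might look like the sketch below. The parameter names and types are assumptions (the actual prototypes are in socket_layer.h), the ERR_INVALID_ARGUMENT value is a placeholder, and what counts as an "improper argument" (negative descriptor, NULL buffer, empty read buffer) is a guess made for this sketch.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

#define ERR_INVALID_ARGUMENT (-2) /* placeholder: the real value comes from the project's headers */

/* Assumed signatures; check socket_layer.h for the actual prototypes. */
int tcp_accept(int passive_socket)
{
    /* The peer address is not needed, hence the two NULLs. */
    return accept(passive_socket, NULL, NULL);
}

ssize_t tcp_read(int active_socket, char *buf, size_t buflen)
{
    if (active_socket < 0 || buf == NULL || buflen == 0) return ERR_INVALID_ARGUMENT;
    return recv(active_socket, buf, buflen, 0); /* one blocking read */
}

ssize_t tcp_send(int active_socket, const char *response, size_t response_len)
{
    if (active_socket < 0 || response == NULL) return ERR_INVALID_ARGUMENT;
    return send(active_socket, response, response_len, 0);
}
```

Since recv(2) and send(2) may transfer fewer bytes than requested, callers should always check the returned count.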
First simple test
Test framework
Test your implementation by creating two simple programs (see usage examples below):
- a client (tcp-test-client.c) that takes two arguments from the command line: a port (number) and a (short) file;
- a server (tcp-test-server.c) that takes one argument from the command line: a port (number).
The client tests whether the file exists and has a size less than 2048. If so, it:
- sends the length to the server;
- waits for a positive acknowledgment;
- then sends the file to the server;
- waits for an acknowledgment;
- and then stops.
The server waits for connections and, when it receives a file (length first):
- sends an acknowledgment message telling whether the size is smaller than 1024 (or not);
- if not, it returns to the start;
- if yes, it waits for a file (of that size) and prints its content,
- then sends an acknowledgment,
- and starts the whole loop again.
The server never terminates, as it may have to serve several clients/requests.
Important point
You need to make sure that the two ends of the communication never get stuck waiting for each other at the same point in time (this would lead to a "deadlock").
Also, when sending several messages using TCP, the boundaries of these messages get lost. For instance, if you use a TCP socket to transmit "Hello" and "Goodbye" as two separate messages, the receiver may interpret this as one single message: "HelloGoodbye". This is because all data transmitted using TCP get "serialized" into a single byte-stream.
We thus need to construct our messages in such a way that we can deserialize the byte-stream back to the original messages. We can for instance make use of a delimiting character or string. For instance, if we know that the character "|" can never be part of our message, we can transmit "Hello", then "|", then "Goodbye" to make the remote end (who may thus receive "Hello|Goodbye" altogether) understand that those are two different messages. In this case, the role of "|" is that of a delimiter.
If there is no character that can act as a delimiter for our protocol, you may add headers containing meta-data about the following message. These headers can then be used by the other end to deserialize the messages.
To keep this test simple, we designed it as a two-message exchange: first the size, then the content. But an issue may arise if the file sent starts with some digits. We thus suggest that you add a simple delimiter character at the end of the size message.
Similarly, we should have a way to delimit the end of the file (otherwise the next size may still be considered to be part of the previous file). We suggest that you add a simple delimiter string, e.g. "<EOF>".
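To make the delimiter idea concrete, here is a minimal sketch of how a receiver could locate the end of a message in the bytes accumulated so far. The helper name next_message_len() is hypothetical (it is not part of the provided material), and the "|" and "<EOF>" delimiters are simply the ones suggested above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: searches for `delim` in the first `len` bytes of
 * `stream`. Returns the length of the message that precedes the delimiter,
 * or -1 if the delimiter has not (yet) been received. */
static long next_message_len(const char *stream, size_t len, const char *delim)
{
    size_t dlen = strlen(delim);
    if (dlen == 0 || len < dlen) return -1;

    for (size_t i = 0; i + dlen <= len; ++i) {
        if (memcmp(stream + i, delim, dlen) == 0) return (long) i;
    }
    return -1; /* incomplete message: the caller should read more bytes */
}
```

For example, on the 8 received bytes "32|Hello" with delimiter "|", the helper returns 2: the size message is "32" and the file content starts right after the delimiter. A return value of -1 means that more data has to be read before the message can be interpreted.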
Example
Server (in one terminal):
./tcp-test-server 6789
Server started on port 6789
Waiting for a size...
Received a size: 32 --> accepted
About to receive file of 32 bytes
Received a file:
Hello there!
How are you doing?
Waiting for a size...
...
Client (in another terminal):
./tcp-test-client 6789 ../provided/tests/data/hello_there.txt
Talking to 6789
Sending size 32:
Server responded: "Small file"
Sending ../provided/tests/data/hello_there.txt:
Accepted
Done
You can launch the client several times, with different files (for instance ../provided/tests/data/aiw.txt).
(Terminate the server with Ctrl-C.)
Use Wireshark to debug
Use Wireshark to debug your code.
Try many clients at the same time:
for i in $(seq 5); do ./tcp-test-client 6789 ../provided/tests/data/2047.txt > log-$i 2>&1 & done
What happens? (Maybe nothing in particular, actually.)
Concurrent access will not be addressed at this layer, but in the last week, in the HTTP layer.
Simple HTTP layer
HTTP messages handler
In order to be generic (and be able to use our HTTP layer for other services than the one used in this project), we separate the handling of the content of the HTTP requests/services from the handling of the HTTP protocol itself.
This separation is done by passing a function, responsible for the handling of the content of the HTTP requests/services, to the initialization of the HTTP connection. Such a function is called an "HTTP messages handler".
To be able to pass it to the initialization function, we need a specific type: EventCallback, to be defined in http_net.h as a pointer to a function taking a pointer to struct http_message and an int as parameters, and returning an int.
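Following the description above, the type definition could be written as in this sketch. The handler demo_handler() is purely hypothetical and only shows what a function compatible with that type looks like; the meaning given to its int parameter is an assumption.

```c
struct http_message; /* defined elsewhere in the project; only pointers to it are used here */

/* Pointer to a function taking (struct http_message *, int) and returning int. */
typedef int (*EventCallback)(struct http_message *, int);

/* Hypothetical handler, for illustration only: ignores the message and
 * reports success when the int (assumed to be a connection id) is valid. */
static int demo_handler(struct http_message *msg, int connection)
{
    (void) msg;
    return connection >= 0 ? 0 : -1;
}
```

Any function with this shape can then be stored in a variable of type EventCallback and called through it, e.g. `EventCallback cb = demo_handler;`.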
http_receive()
In a file http_net.c (copy it from provided; this file offers the API to a (simplified) generic HTTP server), the http_receive() function is the main function to handle HTTP connections. But in order to prepare for the multi-threaded version (last week), we recommend that you split it into two parts:
- connects the socket with tcp_accept() (returns ERR_IO in case of error);
- (if no error,) handles the connection through a tool function (we propose to name it handle_connection()).
Of course, most of the work now remains to be done in handle_connection().
For future compatibility, its signature has to be:
static void* handle_connection(void* arg)
In our case, it receives a pointer to an int containing the socket file descriptor.
And it returns a pointer to an int containing some error code (ERR_NONE if none). This may seem far-fetched (why not receive and return an int?), but this will be required when adding multi-threading. We provided two examples of how to handle that.
The handle_connection() function:
- reads the HTTP header from the socket into some buffer (the maximum size of HTTP headers is provided as MAX_HEADER_SIZE in http_net.h); notice that this may require several calls to tcp_read(): read as long as the headers do not contain HTTP_HDR_END_DELIM (and you have not read more than MAX_HEADER_SIZE); you can use strstr(3) to find HTTP_HDR_END_DELIM in the buffer;
- handles error cases;
- sends the reply using http_reply(): if the headers contain "test: ok" (use strstr(3) once again), use the HTTP_OK status, otherwise HTTP_BAD_REQUEST; the other parameters can be empty; if http_reply() fails, handle_connection() returns &our_ERR_IO.
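The header-reading loop of the first step could be sketched as below. Several things here are assumptions made to keep the sketch self-contained: the helper name read_headers() is hypothetical, it calls recv(2) directly where the real code would go through tcp_read(), the value of HTTP_HDR_END_DELIM is inferred from the example header given in this handout, and the value of MAX_HEADER_SIZE is a placeholder for the one in http_net.h.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define HTTP_HDR_END_DELIM "\r\n\r\n" /* inferred from the handout's example header */
#define MAX_HEADER_SIZE 8192          /* placeholder: use the value from http_net.h */

/* Hypothetical helper: reads from `fd` until the end-of-headers delimiter
 * shows up. `buf` must have room for MAX_HEADER_SIZE + 1 bytes.
 * Returns the number of bytes read, or -1 on error, closed connection,
 * or headers larger than MAX_HEADER_SIZE. */
static ssize_t read_headers(int fd, char *buf)
{
    size_t total = 0;
    buf[0] = '\0';

    while (strstr(buf, HTTP_HDR_END_DELIM) == NULL) {
        if (total >= MAX_HEADER_SIZE) return -1; /* headers too large */

        ssize_t n = recv(fd, buf + total, MAX_HEADER_SIZE - total, 0);
        if (n <= 0) return -1; /* error, or peer closed before the delimiter */

        total += (size_t) n;
        buf[total] = '\0'; /* keep the buffer a valid C string for strstr() */
    }
    return (ssize_t) total;
}
```

Once the loop ends successfully, a second strstr() call on the same buffer is enough to look for "test: ok".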
http_reply()
The http_reply() function is a tool function to send a general reply, possibly with some content. It:
- allocates a buffer of the proper size (to be computed; read further);
- starts filling this buffer with the header, in the format:
HTTP_PROTOCOL_ID <status> HTTP_LINE_DELIM <headers> Content-Length: <body_len> HTTP_HDR_END_DELIM
where <status>, <headers> and <body_len> have to be replaced by the corresponding parameter values; for instance, the call
http_reply(1234, HTTP_OK, "Content-Type: text/html; charset=utf-8" HTTP_LINE_DELIM, buffer, 6789);
will create the header
"HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=utf-8\r\nContent-Length: 6789\r\n\r\n"
- then adds (copies) the body to the end of the buffer;
- and sends everything to the socket.
The body parameter may be NULL (as long as body_len is 0). This is useful for responses with an empty body.
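One possible way to compute the buffer size and fill it is sketched below. The helper build_reply() is hypothetical and stops before the final send (which would go through tcp_send()). The macro values are assumptions chosen so that the result matches the example header above; in particular, how "HTTP/1.1 200 OK" is split between HTTP_PROTOCOL_ID and HTTP_OK is a guess, so use the definitions from the provided headers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed values, consistent with the example header in this handout. */
#define HTTP_PROTOCOL_ID   "HTTP/1.1 "
#define HTTP_LINE_DELIM    "\r\n"
#define HTTP_HDR_END_DELIM "\r\n\r\n"
#define HTTP_OK            "200 OK"

#define REPLY_FMT "%s%s%s%sContent-Length: %zu%s"

/* Hypothetical helper: builds header + body in a freshly allocated buffer.
 * Returns NULL on error; on success the caller owns (and frees) the buffer
 * and *out_len holds the number of bytes to send. */
static char *build_reply(const char *status, const char *headers,
                         const char *body, size_t body_len, size_t *out_len)
{
    if (status == NULL || headers == NULL || out_len == NULL) return NULL;
    if (body == NULL && body_len != 0) return NULL;

    /* First pass: with a NULL buffer of size 0, snprintf only computes the length. */
    int hdr_len = snprintf(NULL, 0, REPLY_FMT, HTTP_PROTOCOL_ID, status,
                           HTTP_LINE_DELIM, headers, body_len, HTTP_HDR_END_DELIM);
    if (hdr_len < 0) return NULL;

    char *reply = malloc((size_t) hdr_len + body_len + 1);
    if (reply == NULL) return NULL;

    /* Second pass: actually write the header, then append the raw body. */
    snprintf(reply, (size_t) hdr_len + 1, REPLY_FMT, HTTP_PROTOCOL_ID, status,
             HTTP_LINE_DELIM, headers, body_len, HTTP_HDR_END_DELIM);
    if (body_len > 0) memcpy(reply + hdr_len, body, body_len);

    *out_len = (size_t) hdr_len + body_len;
    return reply;
}
```

The body is copied with memcpy() rather than a string function because it may be binary data (an image, for instance) containing zero bytes.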
Tests with a very simple HTTP server
Use the provided http-test-server.c to make some tests. Simply launch this server and, as a client, use curl:
curl -v localhost:8000
curl -H 'test: ok' -v localhost:8000
curl -H 'test: fail' -v localhost:8000
Simple ImgFS server
Main
The final step for this week is to create a simple version of our future HTTP server for ImgFS services.
This is separated over two files (copy them from provided):
- imgfs_server_service.c, which implements the main functionalities needed by our server;
- imgfs_server.c, which runs the server.
In imgfs_server_service.c:
- declare two static global variables, one to store the ImgFS file and another to store the port number (uint16_t);
- define the function server_startup(), which receives argc and argv, and:
  - checks that there is at least one argument (which is the ImgFS file name);
  - opens the ImgFS file (and handles errors, if any);
  - prints the header of the (properly opened) ImgFS file;
  - handles the optional second argument: if it's a valid port number, use it; otherwise use DEFAULT_LISTENING_PORT;
  - initializes the HTTP connection;
  - prints "ImgFS server started on http://localhost:" plus port number, if everything was ok.
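For the optional port argument, a validation helper along these lines may be useful. The name parse_port() is hypothetical, and the choice to reject 0 and anything that is not a plain decimal number in the uint16_t range is an assumption about what "valid port number" means here; the fallback would be DEFAULT_LISTENING_PORT in the real code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: converts `arg` to a port number, or returns
 * `fallback` if `arg` is missing or is not a decimal number in 1..65535. */
static uint16_t parse_port(const char *arg, uint16_t fallback)
{
    if (arg == NULL || arg[0] < '0' || arg[0] > '9') return fallback;

    errno = 0;
    char *end = NULL;
    unsigned long value = strtoul(arg, &end, 10);

    /* Reject overflow, trailing garbage, 0, and out-of-range values. */
    if (errno != 0 || *end != '\0' || value == 0 || value > UINT16_MAX) return fallback;

    return (uint16_t) value;
}
```

For instance, `parse_port("6789", 8000)` yields 6789, whereas `parse_port("70000", 8000)` and `parse_port("abc", 8000)` both fall back to 8000.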
In imgfs_server.c:
- call server_startup();
- loop on http_receive() as long as there is no error (see http-test-server.c for an example).
Signal handling for proper shutdown
Finally, we'd like to properly close the server. For this we will add a signal handler that will close the HTTP connection and the ImgFS file on server termination.
First of all, add a call to http_close() into server_shutdown().
Then, in imgfs_server.c:
- add the function
static void signal_handler(int sig _unused)
which simply calls server_shutdown(), then stops the program using exit(0);
- and call set_signal_handler() from main().
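The pieces could fit together as in the sketch below. Everything except the signal_handler() signature is an assumption: the _unused definition is a guess at the project's macro (GCC/Clang attribute), server_shutdown() is an empty stand-in for the real one, and this set_signal_handler() is only one plausible way of installing the handler with sigaction(2), here for SIGINT and SIGTERM.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

#define _unused __attribute__((unused)) /* assumed definition of the project's macro */

/* Stand-in for the project's server_shutdown(), which would call
 * http_close() and close the ImgFS file. */
static void server_shutdown(void)
{
}

static void signal_handler(int sig _unused)
{
    server_shutdown();
    exit(0);
}

/* Hypothetical stand-in: installs signal_handler() for SIGINT (Ctrl-C)
 * and SIGTERM. Returns 0 on success, -1 on failure. */
static int set_signal_handler(void)
{
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_handler = signal_handler;
    sigemptyset(&action.sa_mask);

    if (sigaction(SIGINT, &action, NULL) < 0) return -1;
    if (sigaction(SIGTERM, &action, NULL) < 0) return -1;
    return 0;
}
```

The handler must be installed before entering the http_receive() loop, so that a Ctrl-C received while waiting for a connection still triggers the shutdown.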
Try it by sending a Ctrl-C to a running server.
Tests
You can test your new server with the same curl commands as above. Test different port numbers.
There is no other "end-to-end" test for this week (except the self-made ones mentioned in this handout), since we did not finish the implementation of a "final product".
Similarly, there is no unit test, since we don't really have independent tool functions this week.