ImgFS: Image-oriented File System --- Finalizing `read` and `insert`

Introduction

This week, you'll implement the commands read (extract an image from the image database) and insert (insert an image into the database). To do this, you'll use features developed last week.

Warning: if you're working ahead, don't forget to run make submit before "polluting" your first submission with early work for this week (which must not be part of what is submitted for grading).
If in doubt/difficulty, ask an assistant (or teacher).

Provided material

As in the previous weeks, we're providing you with test material for this week.

Moreover, in order to reduce the workload, we also provide you with four functions to add to your code. These functions can be found in provided/src/week10_provided_code.txt and consist of:

  • resolution_atoi(), to be added to imgfs_tools.c;

    the purpose of this function is to transform a string specifying an image resolution into one of the enumerations specifying a resolution type, namely:

    • return THUMB_RES if the argument is either "thumb" or "thumbnail";
    • return SMALL_RES if the argument is "small";
    • return ORIG_RES if the argument is either "orig" or "original";
    • return -1 in all other cases, including if the argument is NULL;

    this function is needed to process the command line arguments of the read command;

  • get_resolution(), to be added to image_content.c;

    the purpose of this function is to retrieve the resolution of a JPEG image; it takes an image_buffer as input, which is a pointer to a memory region containing a JPEG image, and image_size which is the size of this region; and it "outputs" (fills, actually) the two parameters height and width;

    the function returns ERR_NONE if there is no problem, or ERR_IMGLIB if there is a VIPS error;

    this function is needed by the insert command;

    Note for those who look at the provided code and may be puzzled by the cast: the prototype of vips_jpegload_buffer() is in fact wrong, as its first argument should be const void* (instead of void *; we read from these data!). In fact, if you take a look at their code, that function only calls vips_blob_new() whose second argument is correctly qualified as const void * (well, even if there is that horrible casting). So we can safely pass a const image_buffer to vips_jpegload_buffer() (by casting it, unfortunately... :-();

  • the two functions do_read_cmd() and do_insert_cmd(), to be added to imgfscmd_functions.c; they do not compile as such yet, since they require three utility functions (to be written, see below).

Tasks

What you have to do this week is implement the do_read() and do_insert() functions, as well as three utility functions for do_read_cmd() and do_insert_cmd().

do_insert()

The do_insert() function adds an image to the "imgFS". Create a new imgfs_insert.c file to implement it.

The implementation logic contains several steps, in an order that must be respected.

Find a free position in the index

First of all, check that the current number of images is less than max_files. Return ERR_IMGFS_FULL if this is not the case.

Next, you have to find an empty entry in the metadata table. Once you have found one, you must:

  • place the image's SHA256 hash value in the SHA field (if necessary, review what you did in the warmup regarding SHA256 computation);
  • copy the img_id string into the corresponding field;
  • store the image size (passed as parameter) in the corresponding ORIG_RES field (beware of type change);
  • use the get_resolution() function (see above) to determine the image width and height; put these values into the orig_res fields of the metadata.
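
As a rough illustration of the search step, the sketch below scans a table for its first unused slot. The struct slot type, its is_valid field, the EMPTY/NON_EMPTY values and the find_free_slot() helper are all stand-ins invented for this example; your real code works on the metadata array and the validity flag defined in your own headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in definitions, for illustration only (assumptions, not the project's types). */
enum { EMPTY = 0, NON_EMPTY = 1 };
struct slot { uint16_t is_valid; };

/* Returns the index of the first empty slot, or -1 if there is none
 * (or if table is NULL). */
static int64_t find_free_slot(const struct slot *table, uint32_t max_files)
{
    if (table == NULL) return -1;
    for (uint32_t i = 0; i < max_files; ++i) {
        if (table[i].is_valid == EMPTY) return (int64_t) i;
    }
    return -1;
}
```

Returning an index rather than a pointer makes it easy to compute, later on, where that single entry lives in the file.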

Image de-duplication

Call last week's do_name_and_content_dedup() function using the correct parameters. In the event of an error, do_insert() returns the same error code.

Writing the image to disk

First, check whether the de-duplication step has found (or not) another copy of the same image. To do this, test whether the original resolution offset is zero (if necessary, review the do_name_and_content_dedup() function).

If the image does not exist, write its contents at the end of the file. Don't forget to finish initializing the metadata.
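
One possible way to append data and remember where it landed is sketched below. The append_blob() helper and its 0/-1 return values are hypothetical; in your code the failure case would map to ERR_IO and the resulting offset would be stored in the metadata entry.

```c
#include <stdio.h>
#include <stdint.h>

/* Appends size bytes at the end of file and stores in *offset the position
 * at which they were written. Returns 0 on success, -1 on error
 * (hypothetical helper, not part of the provided code). */
static int append_blob(FILE *file, const void *data, size_t size, uint64_t *offset)
{
    if (file == NULL || data == NULL || offset == NULL) return -1;
    if (fseek(file, 0, SEEK_END) != 0) return -1;

    long end = ftell(file);            /* current end of file = future offset */
    if (end < 0) return -1;

    if (fwrite(data, 1, size, file) != size) return -1;

    *offset = (uint64_t) end;          /* only set once the write succeeded */
    return 0;
}
```

Recording the offset before the write, but publishing it only after the write succeeded, keeps the metadata consistent if the disk operation fails.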

Updating image database data

Update all the necessary image database header fields. The version must be increased by 1.

Finally, all that's left to do is write the header, and then the corresponding metadata entry to disk (your code must not write all the metadata to disk for each operation!).
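
The following sketch shows one way to rewrite the header and a single table entry without touching the rest of the table. It assumes that the header sits at the very start of the file and is immediately followed by the metadata array; the demo_header and demo_entry types and the write_header_and_entry() helper are invented for the example.

```c
#include <stdio.h>
#include <stdint.h>

/* Stand-in types, for illustration only (assumptions, not the project's types). */
struct demo_header { uint32_t version; uint32_t nb_files; };
struct demo_entry  { char img_id[16]; uint32_t size; };

/* Writes the header at offset 0, then only table[index] at its own position.
 * Returns 0 on success, -1 on I/O error. */
static int write_header_and_entry(FILE *file, const struct demo_header *header,
                                  const struct demo_entry *table, uint32_t index)
{
    if (file == NULL || header == NULL || table == NULL) return -1;

    if (fseek(file, 0, SEEK_SET) != 0) return -1;
    if (fwrite(header, sizeof *header, 1, file) != 1) return -1;

    /* Position of entry number index, assuming the table follows the header. */
    long pos = (long) (sizeof *header + (size_t) index * sizeof *table);
    if (fseek(file, pos, SEEK_SET) != 0) return -1;
    if (fwrite(&table[index], sizeof *table, 1, file) != 1) return -1;

    return 0;
}
```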

do_read()

The second main function of the week is do_read(), to be implemented in a new file, imgfs_read.c.

This function must first find the entry in the metadata table corresponding to the supplied identifier.

If successful, determine whether the image already exists in the requested resolution (it does not if its offset or size is zero). If not, call the lazily_resize() function from last week to create the image at the required resolution. (Note: this should never be needed for ORIG_RES.)

At this point, the position of the image (in the correct resolution) in the file is known, as is its size; you can then read the contents of the file image into a dynamically allocated memory region.

If successful, the output parameters image_buffer and image_size should contain the memory address and size of the image.

Be careful to handle possible error cases:

  • return the error code received in the event of an internal function error;
  • return ERR_IO in the event of a read error;
  • return ERR_OUT_OF_MEMORY in the event of a memory allocation error;
  • and return ERR_IMAGE_NOT_FOUND if the requested identifier could not be found.
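
The reading step itself, with its error handling, might look like the sketch below. The read_region() helper is hypothetical, and its -1 and -2 return values merely stand in for ERR_IO and ERR_OUT_OF_MEMORY.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Reads size bytes located at offset in file into a freshly allocated buffer.
 * On success, *out owns the buffer (the caller must free it).
 * Returns 0 on success, -1 on I/O error, -2 on allocation failure
 * (stand-ins for ERR_IO and ERR_OUT_OF_MEMORY). */
static int read_region(FILE *file, long offset, uint32_t size, char **out)
{
    if (file == NULL || out == NULL || size == 0) return -1;

    char *buffer = malloc(size);
    if (buffer == NULL) return -2;

    if (fseek(file, offset, SEEK_SET) != 0
        || fread(buffer, 1, size, file) != size) {
        free(buffer);                  /* do not leak on a failed read */
        return -1;
    }

    *out = buffer;                     /* output is set only on success */
    return 0;
}
```

Leaving the output parameter untouched on failure means the caller never sees a dangling or half-filled buffer.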

Note: a read on a duplicated image does not modify any of its duplicates. Indeed, lazily_resize() has no impact on images other than the one under consideration (and was written before do_name_and_content_dedup()). This behavior (which must be your program's behavior) isn't a big deal in practice, because:

  1. searching for all duplicates at each operation would be too costly in general;
  2. it will be the role of a garbage collector to do this kind of work, once in a while (e.g. every night) and on the whole image database;
  3. and it should often be the case that a duplicated image is, when it arrives, the duplicate of an image already in use, i.e. an image that already has its "small" and "thumb" versions; in this case, the newly inserted image (duplicate) will already share its "small" and "thumb" versions with its original, already present in the database.

Integration into the main program

We already provided you with the two wrap-up functions do_read_cmd() and do_insert_cmd() (see top of this handout), but they still require three utility functions (in imgfscmd_functions.c):

static void create_name(const char* img_id, int resolution, char** new_name);
static int write_disk_image(const char *filename, const char *image_buffer, uint32_t image_size);
static int read_disk_image(const char *path, char **image_buffer, uint32_t *image_size);

The purpose of create_name() is to create, in new_name, the name of the file to use to save the read image (do_read_cmd()), using the following naming convention:

image_id + resolution_suffix + '.jpg'

where:

  • image_id is the image identifier;
  • and resolution_suffix corresponds to _orig, _small or _thumb;

for instance, if the image id is "myid" and the resolution is SMALL_RES, then new_name will contain "myid_small.jpg".
Also have a look at its call for further details if needed.
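
A minimal sketch of this naming convention is given below. It assumes that create_name() allocates the string, that the caller frees it, and that *new_name is left NULL on any failure; the numeric values of the resolution constants are stand-ins for those of your headers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in values, for illustration only; the real ones come from your headers. */
enum { THUMB_RES, SMALL_RES, ORIG_RES };

static void create_name(const char *img_id, int resolution, char **new_name)
{
    if (new_name == NULL) return;
    *new_name = NULL;                  /* stays NULL on any failure */
    if (img_id == NULL) return;

    const char *suffix;
    switch (resolution) {
    case THUMB_RES: suffix = "_thumb"; break;
    case SMALL_RES: suffix = "_small"; break;
    case ORIG_RES:  suffix = "_orig";  break;
    default: return;                   /* unknown resolution */
    }

    /* +1 for the terminating '\0' */
    size_t len = strlen(img_id) + strlen(suffix) + strlen(".jpg") + 1;
    char *name = malloc(len);
    if (name == NULL) return;

    snprintf(name, len, "%s%s.jpg", img_id, suffix);
    *new_name = name;
}
```

With these assumptions, create_name("myid", SMALL_RES, &name) leaves name pointing to "myid_small.jpg".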

write_disk_image() is a very simple tool function (five lines or so) to write the content of the provided image_buffer, the size of which is image_size, to a file, the name of which is provided. Have a look at its call for further details if needed.

This function returns ERR_IO on error and ERR_NONE otherwise.
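
One possible shape for this function is sketched here. The ERR_NONE and ERR_IO values are stand-ins (use those of your error header), and reporting NULL arguments as ERR_IO is a simplification made for the example.

```c
#include <stdio.h>
#include <stdint.h>

/* Stand-in error codes, for illustration only. */
#define ERR_NONE 0
#define ERR_IO   (-1)

static int write_disk_image(const char *filename, const char *image_buffer,
                            uint32_t image_size)
{
    if (filename == NULL || image_buffer == NULL) return ERR_IO;

    FILE *file = fopen(filename, "wb");      /* binary mode: JPEG data */
    if (file == NULL) return ERR_IO;

    size_t written = fwrite(image_buffer, 1, image_size, file);
    int close_status = fclose(file);         /* always close, even on error */

    return (written == image_size && close_status == 0) ? ERR_NONE : ERR_IO;
}
```

Checking the result of fclose() matters here, since buffered data may only reach the disk at that point.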

Finally, read_disk_image() reads an image from disk, the filename of which is provided in path. It reads the image into image_buffer and sets image_size to its corresponding size.

This function returns ERR_IO in case of a filesystem error, ERR_OUT_OF_MEMORY in case of a memory allocation error, and ERR_NONE otherwise.
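
A sketch of one way to do this follows: measure the file with fseek()/ftell(), allocate, then read everything at once. The error code values are stand-ins; treating an empty file and NULL arguments as ERR_IO is an assumption of the example, not a requirement of the handout.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Stand-in error codes, for illustration only. */
#define ERR_NONE          0
#define ERR_IO            (-1)
#define ERR_OUT_OF_MEMORY (-2)

static int read_disk_image(const char *path, char **image_buffer,
                           uint32_t *image_size)
{
    if (path == NULL || image_buffer == NULL || image_size == NULL) return ERR_IO;

    FILE *file = fopen(path, "rb");
    if (file == NULL) return ERR_IO;

    int ret = ERR_IO;
    char *buffer = NULL;
    long size = -1;

    /* Determine the file size, then rewind to the beginning. */
    if (fseek(file, 0, SEEK_END) == 0 && (size = ftell(file)) > 0
        && (unsigned long) size <= UINT32_MAX
        && fseek(file, 0, SEEK_SET) == 0) {
        buffer = malloc((size_t) size);
        if (buffer == NULL) {
            ret = ERR_OUT_OF_MEMORY;
        } else if (fread(buffer, 1, (size_t) size, file) == (size_t) size) {
            *image_buffer = buffer;
            *image_size = (uint32_t) size;
            buffer = NULL;             /* ownership handed to the caller */
            ret = ERR_NONE;
        }
    }

    free(buffer);                      /* no-op if ownership was handed over */
    fclose(file);
    return ret;
}
```

The single exit path ensures that the file is closed and the buffer released in every error case.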

Update help

Finally, modify the help command to reflect the new commands:

> ./imgfscmd help
imgfscmd [COMMAND] [ARGUMENTS]
  help: displays this help.
  list <imgFS_filename>: list imgFS content.
  create <imgFS_filename> [options]: create a new imgFS.
      options are:
          -max_files <MAX_FILES>: maximum number of files.
                                  default value is 128
                                  maximum value is 4294967295
          -thumb_res <X_RES> <Y_RES>: resolution for thumbnail images.
                                  default value is 64x64
                                  maximum value is 128x128
          -small_res <X_RES> <Y_RES>: resolution for small images.
                                  default value is 256x256
                                  maximum value is 512x512
  read   <imgFS_filename> <imgID> [original|orig|thumbnail|thumb|small]:
      read an image from the imgFS and save it to a file.
      default resolution is "original".
  insert <imgFS_filename> <imgID> <filename>: insert a new image in the imgFS.
  delete <imgFS_filename> <imgID>: delete image imgID from imgFS.