CS270 -- Rich's Hints for DIY File System Phase 1

Rich Wolski --- Fall, 2023

this page last updated: Wed Sep 6 11:31:54 PDT 2023


Roadmap

The goal of Phase 1 is to establish a familiarity with the technologies that are necessary to complete Phase 2. These technologies are Eucalyptus, the Linux package-manager and configuration system, FUSE, and a little bit of Linux systems programming. This document is intended to be a guide but not a tutorial. That is, it expects that you will be using the typical documentation and Internet information outlets to familiarize yourself with the details.

Requirements and Style

When faced with a project that requires an amalgamation of disparate technologies, there are a couple of different stylistic approaches that seem to be effective (at least in my experience). The first is to review all of them to make sure you understand what they do and how they might work together. In this case, the technologies are Eucalyptus, the Linux package-manager and configuration system, FUSE, and a little bit of Linux systems programming. Taken together, this is your "stack" for the project. You can start out by planning how to integrate the various components of your stack and then start to develop according to this plan.

The other approach is to start at the bottom of "the stack" and to incrementally add functionality. I typically use this approach when I haven't used a technology much, so I'll use this style for the remainder of this guide.

The Approach

The incremental approach I used to complete Phase 1 consisted, roughly, of the following steps.

Starting a VM in Eucalyptus

I used the Eucalyptus tutorial and the AWS CLI to start an instance. The only choice here is which distro to use. I used CentOS 9.1 Stream, which is
IMAGE	ami-f428965264d32a389	000494469007/CentOSStream-9.1.ebs	000494469007	available	public	x86_64	machine				ebs	hvm
in the campus Eucalyptus cloud. You are free to use whatever distro you like that is installed on the cloud. The remainder of this guide will assume CentOS 9, however. The choice of distro (and sometimes the choice of version) affects the specific commands you will need to accomplish the various steps.

Install FUSE and Dependencies

FUSE is used pretty heavily by Docker so it is already packaged for many distros. However, to develop with FUSE, you need the development header files and libraries. Also, you need to make a choice regarding what language and runtime system you plan to use. The basic FUSE interface and the documentation for LibFUSE are all for the C language. However, there are C++, Python, Go, and Rust bindings if you decide not to use C. My experience with these bindings is sketchy so I used the native C interface.

To install the necessary FUSE dependencies on CentOS 9, as the root user, I ran

dnf -y update
dnf -y install gcc g++ fuse3-devel autoconf git gdb
which gets the basic C development tools as well. Note that the initial "cloud-user" has sudo privileges in the CentOS 9 image. You probably want to do a little reading up on how to run these tools either as the root user or with sudo if this is not familiar to you.

Install and Test FUSE Hello World

The LibFUSE repository on GitHub has several helpful examples, including a minimalist "hello world" example. I cloned the repo and followed the build instructions for hello world. You will need to read through the documentation to understand what you have built, how to start it, and how to stop it. You might also review the documentation on the Linux mount command. FUSE does something similar, but in a different way. This is a good opportunity to hone your search skills with respect to understanding how to use FUSE. At the end of this step, though, the "hello world" example should work and you should know how to unmount the FUSE file system that it implements.

Modify the Hello World Example

The "hello world" example is a good "template" for building a FUSE daemon (the process FUSE will launch for you) and for implementing your call-backs. Take a look at the code and you will see that it records a file name and a string (which can be passed as command line arguments) in an in-memory data structure (look for "options"). When you run it and you type ls in the top level directory for the FUSE file system, you will see a file listed. If you run the Linux utility cat on that file, you should see the string. It is as if FUSE created a file with that string in it but, really, it is just a process responding to the file system calls that Linux makes, as if there were a file in the file system.

At this point, you might find it instructive to run the hello world example with debugging turned on (read the FUSE documentation to see how to do that). When debugging is enabled, FUSE will print a trace of all of the Linux file system calls that the shell (bash in this case) makes when you access the "fictitious" file that the hello world example is implementing.

Of particular interest is that the example only allows you to read the file. Any attempt to write the file will fail. That is, the example emulates a read-only file with a single string in it. For your Phase 1 assignment, you will need to be able to write the file (I will not test being able to read it).

Thus, the simplest thing to do here is to add to this example the ability to change the string in the file by calling the Linux write() file system call. You will need to consult the LibFUSE documentation to understand how to add a call-back for Linux write(). You will also want to write a test code that opens the file and calls write() on it with a string you specify. My test for Phase 1 will essentially do that.
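As a rough illustration of what such a test code might look like, here is a minimal sketch in C. The function name write_string is hypothetical (it is not part of the assignment or of LibFUSE), and the sketch assumes the file already appears to exist under the mount point, so it opens it with O_WRONLY and nothing else.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open an existing file for writing and send
 * the string `msg` to it with a single write() call.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t write_string(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    ssize_t n = write(fd, msg, strlen(msg));
    if (n < 0)
        perror("write");

    close(fd);
    return n;
}
```

Calling write_string("/tmp/mnt/hello", "testing 1 2 3\n") from a main(), where /tmp/mnt is only a stand-in for wherever you mounted the FUSE file system, should show up as a write request in the FUSE debugging trace.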

Install and Test a Linux mail Client

The next step is to make sure that (as the root user) you can send mail to a user at UCSB from the command line. Every distro (and usually every version of every distro) has a different preferred command-line mail client. For CentOS 9, the preferred client is s-nail, so for me this step was to install and configure s-nail so that I could execute the following line from the command line as the root user:
echo "This is a test" | mail -s "CS270 testing" rich@cs.ucsb.edu
On CentOS 9, getting this to work also required the installation and configuration of sendmail. You will need to figure out, for your distro, what you need to do to get this to work.

Also, PLEASE DO NOT USE MY MAIL ADDRESS DURING TESTING. Substitute your UCSB email address for mine when you are testing. Then, before you turn in your solution, change the email address to be mine so that when I run your code, I will get an email.

Integrate the Mail Client

At this point, you have a way to read and write a "fictitious" file and a way to send mail to a UCSB email address from the Linux command line. The next step is to integrate the two so that when a test code opens a file and writes a string into it, you send the string to the email address you are using as a target. To do so, you need to make two changes to your FUSE program.

The first is to allow the user code to specify the name of the file using the Linux open() file system call. The Hello World example doesn't use the argument passed to open() to specify the file name. It sets the "options" structure from a parameter passed when the program is initialized (and includes a default if nothing is specified). Change this so that the file name is taken from the name specified in open(). You will need to understand the contents of the "path" parameter passed by FUSE when it makes the call-back for open().
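To make the "path" discussion a little more concrete: in the Hello World code, the call-backs compare path+1 against the stored file name, because the path FUSE passes is relative to the mount point and begins with '/'. The sketch below shows one way to pull a file name out of such a path. The names remember_name and current_name are hypothetical, and the fixed-size buffer is an assumption made only to keep the sketch short.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical storage for the most recently opened file name. */
static char current_name[256];

/*
 * Hypothetical helper: given a FUSE-style path such as "/notes.txt",
 * record "notes.txt" in current_name.
 * Returns 0 on success, or -1 if the path is NULL, does not start
 * with '/', names the root directory itself, or is too long.
 */
int remember_name(const char *path)
{
    if (path == NULL || path[0] != '/' || path[1] == '\0')
        return -1;

    const char *name = path + 1;            /* skip the leading '/' */
    if (strlen(name) >= sizeof(current_name))
        return -1;

    snprintf(current_name, sizeof(current_name), "%s", name);
    return 0;
}
```

Keep in mind that the example's other call-backs (getattr and readdir, for instance) also consult the stored name, so whatever you do in open() may need to agree with them.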

The second change to make is to alter the write() call-back so that instead of changing the string that your FUSE process remembers (in "options"), you send the string to the mail client as the content of a message that will be mailed to the target recipient. This functionality is most easily implemented using a "shell out" which is a facility available in most language runtimes that allows a calling program to send a command line (and some parameters) to a shell and to wait for the command to complete. In C, one (but by no means the only) way to do that is to use the Linux popen() call. You will want to do a little reading to determine what the best way is to invoke the Linux mail client of your choosing with a specific target recipient and to send it a string that it will mail to that recipient.
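Here is a minimal sketch of a popen()-based shell out, offered as one possibility rather than the required approach. The function name pipe_to_command is hypothetical. It hands a command line to the shell, writes a buffer to that command's standard input, and waits for the command to finish.

```c
#define _POSIX_C_SOURCE 200809L   /* for popen() and pclose() */
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: run `cmd` through the shell and feed `len`
 * bytes of `body` to its standard input.
 * Returns the wait status reported by pclose() (0 when the command
 * exits successfully), or -1 if the pipe could not be opened or the
 * body could not be written in full.
 */
int pipe_to_command(const char *cmd, const char *body, size_t len)
{
    FILE *p = popen(cmd, "w");              /* "w": we write, the command reads */
    if (p == NULL)
        return -1;

    size_t sent = fwrite(body, 1, len, p);
    int status = pclose(p);                 /* waits for the command to complete */

    if (sent != len)
        return -1;
    return status;
}
```

From inside a write() call-back, something like pipe_to_command("mail -s 'CS270 testing' you@ucsb.edu", buf, size) would send the bytes FUSE handed you as the message body. The address here is a placeholder, and the exact mail command line depends on the client you installed.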

There is a question about how to specify the recipient. As I mentioned, during testing, it should not be me. The two easiest options are either to hard code it into your FUSE program (and change it to my email address just before you turn in your solution) or to add an option to the "options" structure in the original Hello World FUSE example program (and to document how I should set it when I run your solution).

Write Up your Recipe

Once you have this working, the next step is to write down a series of steps that I can "cut and paste" from your document so that I wind up with an email when I start "from scratch." If you used the Eucalyptus console to start the instance, give me the AMI and the instance type you used. You should include every command that is necessary. Check out the bash history command to see what commands you have been typing. You might also consider using the Linux script command to capture your keystrokes so that you have a record of what you did.

Start from Scratch and Test your Recipe

Lastly, you should pretend that you are grading someone else's assignment. You will need a test code that opens a file and writes a string into it (I have one that I will use). Start from scratch and follow your recipe exactly, being as precise as possible (using an email address other than mine). If anything fails and needs to be "redone" or changed (even slightly), do not assume that I will "know" what you meant or that I can "figure it out." Instead, write it down, change the recipe, terminate your VM, and start over.

When you get through the recipe and your email recipient receives an email, once you change the email recipient to be me (or your recipe specifies how I indicate that I want to receive the email), you are finished and ready to turn in your Phase 1.