this page last updated: Thu Sep 7 13:21:00 PDT 2023
This goal is, indeed, achievable. However, the full Linux file system interface is quite extensive. In addition to the POSIX interface, there is a veritable zoo of features that the file system calls implement. We will only be testing a subset of these features. Further, you are free to design your file system in any way you choose. If this is your first up-close encounter with a file system, however, or if you are having trouble understanding how the pieces all fit together, this document will provide one possible roadmap for the project. It represents, more or less, how I implemented it. You need not consider it a prescription. Rather, if you don't have a strong feeling about how to proceed, you might consult this text, since I know that a design and implementation that follows it will result in a working file system.
For example, it is fine to design your data structures so that there is only one file system of your type mounted at a time. If you were building this file system for a real OS, you'd need to handle having multiple file systems mounted simultaneously. Feel free to design for the more general case, but it is not necessary.
The other way to look at the requirements for this project is to ask "what must my file system do?" At the end of the quarter, I will ask you to add my ssh public key to your instance and for you to start up your file system using a single mount point. As the root user, I will install several test routines by copying them into your file system through this mount point and I will run the routines. They will both stress test your implementation and record some performance stats.
I will also ask you to demo any cool features or features of which you are particularly proud.
And that's it. The goals (in order of importance) are first to enjoy the process, second not to have your file system crash or corrupt the storage, and third to make your file system performant.
By way of style, it has been my experience that building this type of system is best accomplished using two basic principles.
Then, once you have your file system working with memory buffers, you need only rewrite this layer to use the raw disk rather than an in-memory buffer in block-sized units. Thus, henceforth I will refer to "on disk" as operations that go through layer 0 (which will eventually read and write a disk).
Your test routines should verify that you can access all of the blocks on disk individually and that there is no corruption (e.g. due to a miscalculation resulting in overlap) in the blocks.
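A minimal sketch of such a test, assuming a Layer 0 that exposes block-sized reads and writes over an in-memory buffer. The names `MemDisk`, `read_block`, `write_block`, and `verify_all_blocks`, and the sizes used, are illustrative assumptions and not part of the assignment. The test writes a distinct pattern to every block before reading any of them back, so a miscalculated offset that makes two blocks overlap shows up as a mismatch.

```python
BLOCK_SIZE = 512   # assumed small block size for testing
NUM_BLOCKS = 64    # assumed "disk" size, in blocks


class MemDisk:
    """Hypothetical Layer 0: block-sized access to an in-memory buffer."""

    def __init__(self, num_blocks=NUM_BLOCKS, block_size=BLOCK_SIZE):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.buf = bytearray(num_blocks * block_size)

    def _check(self, blockno):
        if not 0 <= blockno < self.num_blocks:
            raise IndexError(f"block {blockno} out of range")

    def read_block(self, blockno):
        self._check(blockno)
        start = blockno * self.block_size
        return bytes(self.buf[start:start + self.block_size])

    def write_block(self, blockno, data):
        self._check(blockno)
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        start = blockno * self.block_size
        self.buf[start:start + self.block_size] = data


def pattern(blockno, block_size):
    """Return a block-sized pattern derived from the block's own number."""
    tag = blockno.to_bytes(4, "little")
    return (tag * (block_size // 4 + 1))[:block_size]


def verify_all_blocks(disk):
    """Write every block, then read every block; return the damaged ones."""
    for b in range(disk.num_blocks):
        disk.write_block(b, pattern(b, disk.block_size))
    return [b for b in range(disk.num_blocks)
            if disk.read_block(b) != pattern(b, disk.block_size)]
```

An empty list from `verify_all_blocks(MemDisk())` means every block was individually addressable and no write disturbed a neighbor.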
Your test codes for Layer 2 should be able to make directories and files. They should follow the correct creation semantics (e.g. a mknod fails if it specifies a path that contains non-existent directories). You should test file reads/writes that use direct blocks in your inodes, indirect blocks, and double indirect blocks. You should also make sure that files get deleted properly and that the free lists look reasonable as blocks and inodes are allocated and released.
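To aim reads and writes at each kind of block, it helps to know where the boundaries fall. The sketch below assumes a classic inode layout with `n_direct` direct pointers, one single indirect block, and one double indirect block. The function names and the default of 12 direct pointers are assumptions for illustration; your inode layout may differ.

```python
N_DIRECT = 12  # assumed number of direct pointers per inode


def classify_block(file_block, ptrs_per_block, n_direct=N_DIRECT):
    """Map a logical block index within a file to (level, indices).

    'direct'   -> (slot in the inode,)
    'indirect' -> (slot in the single indirect block,)
    'double'   -> (slot in the double indirect block, slot in the 2nd level)
    """
    if file_block < 0:
        raise ValueError("negative block index")
    if file_block < n_direct:
        return ("direct", (file_block,))
    file_block -= n_direct
    if file_block < ptrs_per_block:
        return ("indirect", (file_block,))
    file_block -= ptrs_per_block
    if file_block < ptrs_per_block ** 2:
        return ("double", divmod(file_block, ptrs_per_block))
    raise ValueError("block index beyond the largest supported file")


def boundary_offsets(block_size, ptrs_per_block, n_direct=N_DIRECT):
    """Byte offsets on either side of each boundary, useful as test targets."""
    first_indirect = n_direct * block_size
    first_double = (n_direct + ptrs_per_block) * block_size
    return [first_indirect - 1, first_indirect,
            first_double - 1, first_double]
```

Writes that straddle the offsets returned by `boundary_offsets` exercise the transitions between direct, indirect, and double indirect blocks, which is where off-by-one errors tend to hide.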
namei in some Unix implementations) in each call. It is also possible to get FUSE to pass back a file info data structure in which you can store your own information (e.g. the inode number) for subsequent calls. You are free to use this facility if you so choose. Using namei each time means that each call will get the true conversion to an inode but it will be slower than it needs to be. You might start with the namei approach and then see if using FUSE to pass back the inode number when it can improves performance.
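As a rough illustration of what a namei-style lookup does, the sketch below walks a path one component at a time through a toy directory table. The table layout (a dictionary from a directory's inode number to its entries), the root inode number, and the function signature are all assumptions made for this example. In a real implementation each step would read a directory's blocks through the lower layers.

```python
ROOT_INO = 1  # assumed inode number of the root directory


def namei(dirs, path, root=ROOT_INO):
    """Resolve an absolute path to an inode number, or None if it is missing.

    dirs is a hypothetical table: directory inode -> {entry name: inode}.
    """
    if not path.startswith("/"):
        raise ValueError("only absolute paths are handled in this sketch")
    ino = root
    for name in path.split("/"):
        if not name:                # leading, trailing, or doubled slashes
            continue
        entries = dirs.get(ino)
        if entries is None:         # a non-directory in the middle of the path
            return None
        ino = entries.get(name)
        if ino is None:             # component does not exist
            return None
    return ino
```

Every call pays for one directory lookup per path component, which is the cost the paragraph above refers to. With libfuse, the inode number found at open time is commonly stashed in the `fh` field of `struct fuse_file_info` so that later reads and writes on the same open file can skip the walk.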
Also, the debugger is most helpful for development at this layer. There isn't much documentation that explains exactly what comes across the FUSE interface in gory detail. I found it instructive to write stubs at layer 3 and to set breakpoints (using the debugger) in the stubs just to see what FUSE was passing into my code.
Testing at this stage involves mounting a small file system and using Linux to test it out. Consider writing test routines that use ASCII text since it is easy to use the shell with such tests, and it is also easy to spot corrupted files. While the file system is small (it must be able to fit in memory), all of the "standard" file operations should work when your tests are complete.
At this stage, you should have a working file system that uses FUSE and an in-memory buffer as the disk store. You can pretty much get all of the system calls to work. The only restriction is that the sizes will need to be pretty small. Consider using a small block size and small constants to test everything and then moving on to implement stress tests. The larger sizes possible with a real disk may expose some sizing bugs.
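A quick calculation shows why small constants help. The sketch below assumes an inode with direct pointers, one single indirect block, and one double indirect block, and computes the largest file that layout can hold. The pointer size, the number of direct pointers, and the function names are assumptions for illustration only.

```python
def max_file_blocks(block_size, ptr_size=4, n_direct=12):
    """Largest file, in blocks, for the assumed inode layout."""
    ptrs = block_size // ptr_size       # block pointers per indirect block
    return n_direct + ptrs + ptrs * ptrs


def max_file_bytes(block_size, ptr_size=4, n_direct=12):
    """Largest file, in bytes, for the assumed inode layout."""
    return max_file_blocks(block_size, ptr_size, n_direct) * block_size
```

Under these assumptions, a 64-byte block gives a maximum file size of 18176 bytes, so a test file of a few kilobytes already reaches the double indirect blocks and an in-memory disk can hold many such files. With 4096-byte blocks the same layout allows files slightly larger than 4 GiB, which is where arithmetic that silently assumed 32-bit sizes tends to break.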
/dev. Launch an instance in Eucalyptus, create a volume, and attach it to the instance. The new device can be accessed like a file through the /dev entry.
For example, if the attached volume is /dev/vdb then
Rewrite Layer 0 and rerun your tests with a file system that is at least 2 GB. Then try formatting and mounting a 30GB file system and test.
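One possible shape for the rewritten Layer 0 is sketched below. It keeps a block-sized read/write interface and swaps the in-memory buffer for positioned reads and writes on a device opened like an ordinary file. The class name `RawDisk`, its method names, and the 4096-byte default block size are assumptions for illustration.

```python
import os


class RawDisk:
    """Hypothetical Layer 0 backed by a raw device (or any regular file)."""

    def __init__(self, path, block_size=4096):
        self.block_size = block_size
        self.fd = os.open(path, os.O_RDWR)
        # Seeking to the end reports the size of a file or block device.
        size = os.lseek(self.fd, 0, os.SEEK_END)
        self.num_blocks = size // block_size

    def _check(self, blockno):
        if not 0 <= blockno < self.num_blocks:
            raise IndexError(f"block {blockno} out of range")

    def read_block(self, blockno):
        self._check(blockno)
        data = os.pread(self.fd, self.block_size, blockno * self.block_size)
        if len(data) != self.block_size:
            raise OSError("short read")
        return data

    def write_block(self, blockno, data):
        self._check(blockno)
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        written = os.pwrite(self.fd, data, blockno * self.block_size)
        if written != self.block_size:
            raise OSError("short write")

    def close(self):
        os.close(self.fd)
```

If the upper layers only ever call `read_block` and `write_block`, then `RawDisk("/dev/vdb")` can replace the in-memory version without touching them. Opening a device under /dev normally requires root privileges, and pointing the same class at an ordinary file is a convenient way to rerun the Layer 0 tests before touching a real volume.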
However, I will be grading your Phase 2 so you might legitimately ask "what do I need to do to get full credit?" The answer to that question is that you need to make sure you implement and test more Linux functionality than I will be able to test during your final presentation period. You won't know what I will test (although you do know it won't take more than about 5 minutes) so you need to make sure that your implementation is as complete as possible. What this means is that you should be writing tests throughout your development and then, when you think it is finished, you should write more tests, each of which is designed to exercise some feature of the file system. In short, you need to test your system more exhaustively than I will test it.
The reason that the project is evaluated this way is that operating system developers can never anticipate (or even see, in almost all cases) what users are doing to "test" the functionality of the OS. Thus, OS development teams must do as much testing as they can manage ahead of a release date so that they can expose and fix as many bugs as they can before users encounter them.
In this class, though, it is not reasonable to ask you to implement the full Linux file-system interface. You might ask "what parts of the interface are fair game?"
Here are a few hints regarding features that I will not test and also features that I might test.
I will also test your file system using "standard" Linux system utilities and tools. Examples include, but are not limited to, the various language compilers (gcc, g++, gfortran, etc.), git, make, bash, grep, awk, sed, ls, and find. You should consider writing tests that use these utilities to access files on your file system as well. At this point you should also write stress tests that do lots of operations with different sizes and offsets to make sure that your file system doesn't have a latent bug or two.
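One way to structure such a stress test is to mirror every write into an in-memory model and compare the file against the model after each operation. The sketch below does this with seeded random offsets and lengths. The function name, its parameters, and their defaults are assumptions for illustration, and the path is meant to be a file under your mount point.

```python
import random


def stress_file(path, rounds=200, max_offset=1 << 16, max_len=5000, seed=0):
    """Do random writes to path and return how many rounds read back wrong."""
    rng = random.Random(seed)           # seeded so a failure can be replayed
    model = bytearray()                 # what the file is expected to contain
    mismatches = 0
    with open(path, "w+b") as f:
        for _ in range(rounds):
            offset = rng.randrange(max_offset)
            length = rng.randrange(1, max_len)
            data = bytes(rng.randrange(256) for _ in range(length))
            f.seek(offset)
            f.write(data)
            f.flush()
            # Writing past the end of a file leaves a zero-filled hole.
            if offset > len(model):
                model.extend(bytes(offset - len(model)))
            model[offset:offset + len(data)] = data
            f.seek(0)
            if f.read() != bytes(model):
                mismatches += 1
    return mismatches
```

A return value of 0 means every round read back exactly what the model predicted. Because the generator is seeded, a failing run can be repeated with the same seed while you narrow down which offset and length triggered the problem.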