Hi all, I have a question about the design of file systems. These days we are seeing a proliferation of file systems, mostly aimed at handling large datasets and providing high availability and speed.

From a file system designer's and developer's standpoint, how do we evaluate the performance and availability of a file system? Are there any benchmarks that we can run? How do we test the code that implements a file system design?

If I want to just write a distributed file system for academic purposes, would it be mandatory for me to have multiple disks or can I fake that effect somehow?

Thanks

Ajay

A: 

You can run benchmark tests using a tool like IOZone. Performance benchmarks only tell part of the story, though. Do you need journaling, replication, etc.? You might get worse performance in a benchmark but have additional features that are essential to your needs. Wikipedia has a decent comparison of some file system features.
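To give a feel for what a throughput benchmark measures, here is a minimal sketch in Python. It is not a substitute for IOZone: the function name `sequential_throughput` and its default sizes are made up for illustration, it only covers sequential I/O on one scratch file, and the read pass will usually be served from the page cache, so the read figure mostly reflects memory speed.

```python
import os
import tempfile
import time


def sequential_throughput(directory, total_bytes=8 * 1024 * 1024, block_size=64 * 1024):
    """Write, then read back, a scratch file in `directory`.

    Returns (write_MiB_per_s, read_MiB_per_s). Hypothetical helper for
    illustration only; real tools control caching, access patterns, etc.
    """
    block = os.urandom(block_size)
    blocks = total_bytes // block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # include the cost of reaching stable storage
        write_secs = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(block_size):
                pass  # data is discarded; we only time the reads
        read_secs = time.perf_counter() - start
    finally:
        os.remove(path)

    mib = blocks * block_size / (1024 * 1024)
    return mib / write_secs, mib / read_secs


if __name__ == "__main__":
    w, r = sequential_throughput(tempfile.gettempdir())
    print(f"write: {w:.1f} MiB/s, read: {r:.1f} MiB/s")
```

Point `directory` at a mount of the file system under test. Results from a single run like this are noisy, so repeat it and report the spread rather than one number.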

John Paulett
Thanks John! Are there any good books from which I can learn about the whole process of designing, developing and evaluating file systems in detail?
ajay
+1  A: 

Many published file system benchmarks are available. For example: Ext3 vs ReiserFS

My advice to you: take a look at the Linux kernel. It contains many file system drivers that show how file systems are designed and implemented. Linux also lets you simulate a disk with a loopback device (a file system stored in an ordinary file and mounted from it).
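The same "disk in a file" idea can be sketched in user space, which is one way to fake several disks on one machine. This is only an illustration, not how the kernel's loop device works: the class `FileBackedDisk`, its methods, and the 4096-byte block size are all assumptions chosen for the example.

```python
import os

BLOCK_SIZE = 4096  # assumed block size for this sketch


class FileBackedDisk:
    """A fake disk: fixed-size blocks stored in one ordinary file."""

    def __init__(self, path, num_blocks):
        self.num_blocks = num_blocks
        # Reuse an existing image, or create a new one.
        mode = "r+b" if os.path.exists(path) else "w+b"
        self._f = open(path, mode)
        # Size the image like a disk of num_blocks blocks.
        self._f.truncate(num_blocks * BLOCK_SIZE)

    def _check(self, block_no):
        if not 0 <= block_no < self.num_blocks:
            raise IndexError(f"block {block_no} out of range")

    def read_block(self, block_no):
        self._check(block_no)
        self._f.seek(block_no * BLOCK_SIZE)
        return self._f.read(BLOCK_SIZE)

    def write_block(self, block_no, data):
        self._check(block_no)
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self._f.seek(block_no * BLOCK_SIZE)
        self._f.write(data)

    def close(self):
        self._f.close()
```

Creating several instances with different image paths gives you several "disks" to build on. A toy file system written against `read_block` and `write_block` can later be pointed at a real block device with few changes.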

Another option: if you are going to create a partition on a raw disk, you may want to do that development inside a virtual machine, so that you do not have to buy new equipment or risk damaging your development machine.

monksy
Thanks Steven! Any suggestions if I want to start writing my own file system from a learning standpoint? Should I start with something like GFS, or first build a simpler file system, evaluate it, benchmark it, and then move on to more modern and complex file systems? Are there any good books that can help me learn how to design and develop file systems?
ajay
I've got a book called "Linux File systems." From a look through it, it appears to explain the Linux code for the file systems. As with any complicated subject, start small and then branch out.
monksy
If you want to start "big", look at ISO9660. It is the file system for CDs. You can find many ISO9660 images online, and they are easy to mount on Linux. Like I said, I would recommend that you start small.
monksy
A: 

As I said here, I really recommend the journal paper by Stony Brook University and IBM Watson Labs in ACM Transactions on Storage about file system benchmarking, in which they present different benchmarks and their strong and weak points: A nine year study of file system and storage benchmarking.

They give lots of advice on how to benchmark a file system. It is not an easy task to do it right.

I would say: it is better with multiple disks and multiple machines, otherwise I, as a reviewer, would probably have doubts about your evaluation. I know the problem well: I myself have only a few nodes and a few dozen disks available for my research. There are disk simulators, e.g. DiskSim, that could perhaps be used to fake the disks, but in a distributed setting you probably have to fake the other components too (networking, locking). It may not be impossible (simulations are often used in other distributed settings, e.g. sensor networks), but it is hard to do in a rigorous way.
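For learning purposes, faking the nodes inside one process can at least exercise the availability logic. The following is a rough sketch under heavy assumptions: `FakeNode` and `ReplicatedStore` are hypothetical names, storage is an in-memory dict, and there is no model of network latency, partial writes, or locking, which is exactly the part that is hard to simulate rigorously.

```python
import random


class FakeNode:
    """In-memory stand-in for a storage server (simulation only)."""

    def __init__(self, name):
        self.name = name
        self.up = True      # flip to False to inject a failure
        self.blocks = {}

    def put(self, key, value):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.blocks[key] = value

    def get(self, key):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return self.blocks[key]


class ReplicatedStore:
    """Writes each key to `replicas` nodes; reads from the first live copy."""

    def __init__(self, nodes, replicas=2):
        self.nodes = nodes
        self.replicas = replicas

    def _placement(self, key):
        # Deterministic placement: a private RNG seeded with the key
        # always picks the same nodes for the same key.
        return random.Random(key).sample(self.nodes, self.replicas)

    def put(self, key, value):
        for node in self._placement(key):
            node.put(key, value)

    def get(self, key):
        for node in self._placement(key):
            try:
                return node.get(key)
            except ConnectionError:
                continue  # try the next replica
        raise ConnectionError(f"no live replica for {key!r}")
```

Setting `up = False` on one node and checking that `get` still succeeds is a crude availability test. It says nothing about performance, so it complements, rather than replaces, runs on real machines.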

Edited: Related to books: There isn't any really good book, but here is a short list:

A lot of storage courses at storage research labs recommend NFS Illustrated (2000).

P.S. And please! Stop moving everything to serverfault where the title contains "file systems"!

dmeister