metafs

Writing a small fs that can be used to index content, keep proper links between objects, etc.

After looking at llfuse and finding it too low-level, I checked python-fuse again and found what I needed: the release and flush methods. This means I should be able to index files once they are written, and do other magical things. What do I want?

  1. (re)index upon modification of file contents. This needs to be done asynchronously, and only if there is no later unlink or write operation on the file (e.g. if indexing lags behind)
  2. preview generation for known filetypes.
  3. uid for files
  4. symlinks that link to uid, and move with the target
  5. tags on files
  6. xattr on files. xattr (besides uid) should be copied on cp.
  7. doing complex searches outside the fs api in command line tools (all files with those tags below this tree with these fulltext words)
  8. directed typed links between files, bidirectional search interface. Links based on uid of course. Maybe even have the links as file objects in the fs, so one could add/modify those links more easily.
  9. hashing/fingerprinting the contents, so that duplicates can be found easily
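
Item 1 could be sketched roughly as below. This is only an illustration of the "skip stale jobs" idea, not tied to python-fuse: the class name `ReindexQueue`, its `on_write`/`on_release`/`on_unlink` hooks and the `index_func` callback are all hypothetical names that the real fs handlers would have to call. Each path carries a generation counter; a queued job is dropped if the counter has moved on by the time the worker gets to it.

```python
import queue
import threading


class ReindexQueue:
    """Index files in a background thread, skipping jobs that became stale."""

    def __init__(self, index_func):
        self._index_func = index_func   # hypothetical callback: index_func(path)
        self._generation = {}           # path -> number of writes/unlinks seen
        self._lock = threading.Lock()
        self._jobs = queue.Queue()

    def _bump(self, path):
        with self._lock:
            self._generation[path] = self._generation.get(path, 0) + 1

    def on_write(self, path):
        # A write makes every job already queued for this path stale.
        self._bump(path)

    def on_unlink(self, path):
        # Same for unlink: nothing left to index.
        self._bump(path)

    def on_release(self, path):
        # File was closed: queue an indexing job tagged with the current generation.
        with self._lock:
            gen = self._generation.get(path, 0)
        self._jobs.put((path, gen))

    def start(self):
        threading.Thread(target=self._worker, daemon=True).start()

    def wait_idle(self):
        # Block until every queued job has been handled (indexed or skipped).
        self._jobs.join()

    def _worker(self):
        while True:
            path, gen = self._jobs.get()
            try:
                with self._lock:
                    stale = self._generation.get(path, 0) != gen
                if not stale:
                    self._index_func(path)
            finally:
                self._jobs.task_done()
```

The worker only compares generations when it picks a job up, so a write that arrives while `index_func` is already running is not caught here; the following release would simply queue a fresh job.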

It might make sense to store all the metadata in a database, and use the fs api (especially xattr) to query the database for the metadata. This way the database can be used for more complex queries, while the standard command line tools keep working.
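
A minimal sketch of that split, assuming sqlite as the database. The table layout and the function names (`open_store`, `set_xattr`, `get_xattr`, `add_tag`, `uids_with_all_tags`) are made up for illustration; the idea is that the fs getxattr handler would end up in something like `get_xattr`, while an external search tool runs queries like `uids_with_all_tags` directly against the database.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the metadata database. The schema is an assumption."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS xattr ("
        " uid TEXT NOT NULL, name TEXT NOT NULL, value TEXT NOT NULL,"
        " PRIMARY KEY (uid, name))"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS tag ("
        " uid TEXT NOT NULL, tag TEXT NOT NULL,"
        " PRIMARY KEY (uid, tag))"
    )
    return db


def set_xattr(db, uid, name, value):
    db.execute("INSERT OR REPLACE INTO xattr VALUES (?, ?, ?)", (uid, name, value))


def get_xattr(db, uid, name):
    """What a getxattr handler could return; None if the attribute is not set."""
    row = db.execute(
        "SELECT value FROM xattr WHERE uid = ? AND name = ?", (uid, name)
    ).fetchone()
    return row[0] if row else None


def add_tag(db, uid, tag):
    db.execute("INSERT OR IGNORE INTO tag VALUES (?, ?)", (uid, tag))


def uids_with_all_tags(db, tags):
    """The kind of query that is awkward through xattr but easy in SQL."""
    tags = sorted(set(tags))
    if not tags:
        return []
    marks = ",".join("?" * len(tags))
    rows = db.execute(
        f"SELECT uid FROM tag WHERE tag IN ({marks})"
        " GROUP BY uid HAVING COUNT(*) = ? ORDER BY uid",
        (*tags, len(tags)),
    )
    return [r[0] for r in rows]
```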

Ideally the fs would be good enough (leaving performance aside for now) to run below a samba or nfs server, adding all the good features without having to modify, or even reconfigure, the servers at all.

It seems to make more sense to store the uid -> filename mapping as part of the file entry in an sql database: the read lookup should then be fairly cheap (whereas e.g. walking up the tree to rebuild the path would be quite costly). Searches like 'everything below this tree' should work as well.
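
Roughly what I have in mind, as a sketch with sqlite. The `file` table and the helper names are assumptions, nothing is settled. With the full path stored per entry, uid -> path is a single primary key lookup, and 'everything below this tree' becomes a range scan over the path column: every path starting with `dir/` sorts between `dir/` and `dir0`, because `0` is the character right after `/`.

```python
import sqlite3


def open_index():
    db = sqlite3.connect(":memory:")
    # One row per file entry; the full path is stored alongside the uid.
    db.execute("CREATE TABLE file (uid TEXT PRIMARY KEY, path TEXT NOT NULL)")
    db.execute("CREATE INDEX file_path ON file (path)")
    return db


def add_file(db, uid, path):
    db.execute("INSERT INTO file VALUES (?, ?)", (uid, path))


def path_for_uid(db, uid):
    """Resolve a uid link target with one lookup, no tree walking."""
    row = db.execute("SELECT path FROM file WHERE uid = ?", (uid,)).fetchone()
    return row[0] if row else None


def uids_below(db, directory):
    """All uids whose path lies below the given directory."""
    base = directory.rstrip("/")
    rows = db.execute(
        "SELECT uid FROM file WHERE path >= ? AND path < ? ORDER BY path",
        (base + "/", base + "0"),
    )
    return [r[0] for r in rows]
```

Using a range instead of LIKE avoids having to escape `%` and `_` in directory names, and it can use the index on `path`.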

Another question is where to store the uid. Putting it into a real xattr might be safer, but if I can trust the metafs, it should be able to trace all move, rename and other operations just fine. E.g. when renaming or moving a directory, all entries below that directory need to be modified as well. This would of course allow using our metafs on filesystem backends that don't have xattr themselves. One example would be linking to and from DVDs, or vfat fs e.g. on the n900 mobile phone.
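
The directory rename case could look something like this, assuming the uid -> path mapping lives in an sqlite table `file(uid, path)`. The table, `open_index` and `rename_tree` are hypothetical names; the rename handler of the fs would call `rename_tree` after the backend rename succeeded. The uids stay untouched, only the stored paths are rewritten, which is what lets links follow their target.

```python
import sqlite3


def open_index():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE file (uid TEXT PRIMARY KEY, path TEXT NOT NULL)")
    return db


def rename_tree(db, old, new):
    """Rewrite the stored path of an entry and of everything below it.

    Returns the number of entries that were changed.
    """
    old = old.rstrip("/")
    new = new.rstrip("/")
    with db:  # one transaction: either all paths move or none do
        # The renamed entry itself (a file, or the directory's own row).
        changed = db.execute(
            "UPDATE file SET path = ? WHERE path = ?", (new, old)
        ).rowcount
        # Everything below it: swap the old prefix for the new one.
        # substr() is 1-indexed, so len(old) + 1 starts at the "/" after the prefix.
        changed += db.execute(
            "UPDATE file SET path = ? || substr(path, ?)"
            " WHERE path >= ? AND path < ?",
            (new, len(old) + 1, old + "/", old + "0"),
        ).rowcount
    return changed
```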

This also suggests storing a rootuid property to identify the filesystem, which would allow cross-filesystem links. But that might be for a second step.