mgree / ffs
the file filesystem: mount semi-structured data (like JSON) as a Unix filesystem
Home Page: https://mgree.github.io/ffs/
License: GNU General Public License v3.0
We have a lot of `sleep` calls that could be faster.
Write a tool to detect when a mount is ready. (Current benchmarks do this by busy-looping on `umount`.) Use its exit status to determine when it's okay to proceed with the mount, and then to check that the process has completed.
It'll also be useful to have a way to pad on a few milliseconds, since a successful unmount doesn't guarantee the process is done yet. The current `utils/timeout` script is okay, but (a) it's noisy, and (b) it'd be nice to not fix it on `ffs` when we start. Maybe a `waitfor` command that takes a time limit before killing the process?
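A minimal sketch of the polling half of such a tool, using only the standard library. The names `is_mountpoint` and `wait_for` are hypothetical, and the device-number comparison is a common heuristic rather than anything ffs is known to do: it misses bind mounts on the same device and reports `/` as not mounted.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;
use std::thread;
use std::time::{Duration, Instant};

/// Heuristic: a directory is a mountpoint if it lives on a different
/// device than its parent directory.
fn is_mountpoint(path: &Path) -> io::Result<bool> {
    let here = fs::metadata(path)?;
    let parent = fs::metadata(path.join(".."))?;
    Ok(here.dev() != parent.dev())
}

/// Calls `ready` every `interval` until it returns true or `limit` has
/// elapsed. Returns whether the condition was met in time.
fn wait_for<F: FnMut() -> bool>(mut ready: F, limit: Duration, interval: Duration) -> bool {
    let deadline = Instant::now() + limit;
    loop {
        if ready() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(interval);
    }
}
```

A command-line wrapper could call `wait_for(|| is_mountpoint(dir).unwrap_or(false), limit, interval)` and exit with status 0 or 1 depending on the result, which would replace the fixed `sleep` calls in scripts.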
Rather than taking a mountpoint as an argument, use an optional `--mount` flag.
If the input is over STDIN, you have to specify a `--mount`.
If the input is from a file `foo.EXT`, then try to make a directory `foo`. If that file already exists, give up and tell the user about `--mount`. If it doesn't, great: do that. When things unmount, double check the directory is empty and then remove it.
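The file-input case above could be sketched roughly as follows. Both helper names and the error text are hypothetical and are not part of ffs. The sketch relies on `fs::create_dir` failing with `AlreadyExists` when anything is already at the target path, so the existence check and the creation happen in one step.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Derives a mountpoint from the input file name (`foo.json` -> `foo`)
/// and creates it. Fails if something already exists at that path.
fn create_default_mountpoint(input: &Path) -> io::Result<PathBuf> {
    let stem = input.file_stem().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "input has no file name; use --mount")
    })?;
    let mount = input.with_file_name(stem);
    fs::create_dir(&mount).map_err(|e| {
        io::Error::new(
            e.kind(),
            format!("cannot create {}: {} (try --mount)", mount.display(), e),
        )
    })?;
    Ok(mount)
}

/// On unmount: removes the directory only if it is empty.
/// Returns whether the directory was removed.
fn remove_if_empty(mount: &Path) -> io::Result<bool> {
    if fs::read_dir(mount)?.next().is_some() {
        return Ok(false);
    }
    fs::remove_dir(mount)?;
    Ok(true)
}
```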
Need to revise all the tests, and then test a few configurations of this feature.
When running on the citylots JSON file, parsing takes a long, long time. Way longer than I expected---since I was told to "expect in the ballpark of 500 to 1000 megabytes per second deserialization". I suspect I'm doing something wrong.
Relatedly, lazy loading only saves time on loading, not parsing. Can we use serde_json::RawValue and similar tricks to parse but defer construction? I'm not sure every format supports this.
Keep a free list, and reuse inode numbers.
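One possible shape for this, as a sketch only: `InodeAllocator` is a hypothetical type, not something in the ffs source. A real implementation would also need to be sure the kernel no longer refers to a number before handing it out again.

```rust
/// Hands out inode numbers, preferring ones that were released earlier.
struct InodeAllocator {
    next: u64,      // smallest number that has never been handed out
    free: Vec<u64>, // released numbers, available for reuse
}

impl InodeAllocator {
    /// `first` is the first number to hand out; with FUSE the root
    /// directory is inode 1, so a caller would typically pass 2.
    fn new(first: u64) -> Self {
        InodeAllocator { next: first, free: Vec::new() }
    }

    /// Reuses a released number if there is one, else takes a fresh one.
    fn alloc(&mut self) -> u64 {
        if let Some(ino) = self.free.pop() {
            return ino;
        }
        let ino = self.next;
        self.next += 1;
        ino
    }

    /// Returns a number to the free list.
    fn release(&mut self, ino: u64) {
        debug_assert!(ino < self.next && !self.free.contains(&ino));
        self.free.push(ino);
    }
}
```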
An idea from @angelhof: notice directories where every file has numerical name suffixes (e.g., `role0`, `role1`, or simpliciter `0`, `1`, `2`, ...) and automatically treat them as lists.
The best way to do this is to use the `auto` type for directories, too, and new directories have that type assigned.
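The detection step might look something like the sketch below. The function names are hypothetical, and the exact rule (one shared prefix, then at least one ASCII digit) is an assumption about what "numerical name suffixes" should mean.

```rust
/// Returns the part of `name` before its trailing ASCII digits,
/// or None if the name does not end in a digit.
fn prefix_before_digits(name: &str) -> Option<&str> {
    let prefix = name.trim_end_matches(|c: char| c.is_ascii_digit());
    if prefix.len() == name.len() {
        None
    } else {
        Some(prefix)
    }
}

/// True if every name is the same (possibly empty) prefix followed by
/// digits, e.g. ["role0", "role1"] or ["0", "1", "2"].
fn looks_like_list(names: &[&str]) -> bool {
    let mut shared: Option<&str> = None;
    for name in names {
        match (prefix_before_digits(name), shared) {
            (None, _) => return false,
            (Some(p), None) => shared = Some(p),
            (Some(p), Some(q)) if p == q => {}
            _ => return false,
        }
    }
    // An empty directory gives no evidence either way.
    shared.is_some()
}
```

Here `looks_like_list(&["role0", "role1"])` and `looks_like_list(&["0", "1", "2"])` are true, while a directory mixing `role0` with `name` is not treated as a list.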
Would be good to build a `.deb` file for this.
https://github.com/mmstick/cargo-deb seems promising.
Make a little doc/gh-pages directory. Include one of those cute screencast GIFs.
Just load/unload data directly on the filesystem.
Progress indicator. https://docs.rs/indicatif/0.16.2/indicatif/
Multithreading? (Is that ever a performance win?)
BFS?
Flag to follow symlinks, possibly even out of the root.
-i flag for prompting on weirdness.
A `--live` flag to sync the JSON out on every write.
A `--watch` flag to notice updates to the JSON file and rebuild the FS.
With lazy/incremental, this would be complicated.
Trying to mount a JSON file within the virtual filesystem fails (which is ok, I guess):
raphael@tukk ~/Downloads> ./ffs -i ffs1.json
raphael@tukk ~/Downloads> cat ffs1.json
{"option":"test"}
raphael@tukk ~/Downloads> cp ffs1.json ffs1/
raphael@tukk ~/Downloads> ls ffs1/
ffs1.json option
raphael@tukk ~/Downloads> ./ffs -i ffs1/ffs1.json
fusermount: bad mount point ffs1/ffs1: Permission denied
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:38:41
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
This actually creates the subdirectory `ffs1/ffs1` but does not mount anything there. Unmounting the filesystem results in an additional entry "ffs1: {}" in the JSON file. I'd suggest that a failed command should not leave behind any new files/members.
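One way to get that behavior is a guard that removes the directory it created unless the mount succeeds. This is a sketch under assumptions: `CreatedDir` and `mount_with_cleanup` are hypothetical names, and the `mount` closure stands in for whatever call actually performs the mount.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Removes the directory it created when dropped, unless `keep` was called.
struct CreatedDir {
    path: PathBuf,
    armed: bool,
}

impl CreatedDir {
    fn create(path: &Path) -> io::Result<Self> {
        fs::create_dir(path)?;
        Ok(CreatedDir { path: path.to_path_buf(), armed: true })
    }

    /// Call once the mount has succeeded; the directory then stays.
    fn keep(mut self) -> PathBuf {
        self.armed = false;
        self.path.clone()
    }
}

impl Drop for CreatedDir {
    fn drop(&mut self) {
        if self.armed {
            // Best effort: remove_dir only succeeds on an empty directory.
            let _ = fs::remove_dir(&self.path);
        }
    }
}

/// Creates `dir`, runs `mount`, and leaves nothing behind if `mount` fails.
fn mount_with_cleanup(
    dir: &Path,
    mount: impl FnOnce(&Path) -> io::Result<()>,
) -> io::Result<PathBuf> {
    let guard = CreatedDir::create(dir)?;
    mount(dir)?; // on error the guard is dropped here and removes `dir`
    Ok(guard.keep())
}
```

Returning an error from `mount_with_cleanup` instead of calling `unwrap()` would also let the caller print a message rather than panic.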
Started working in https://github.com/mgree/ffs/tree/homebrew.
Really, no idea what I'm doing. Not sure how to depend on a cask (macfuse) and I have the sinking feeling that you can't. Oy.
then saved back into JSON as "space"
There are plenty of interesting options on https://man7.org/linux/man-pages/man8/mount.fuse3.8.html.
Request: https://twitter.com/wtfpdf/status/1409883313888804868
Promising lead on a Rust library: https://github.com/J-F-Liu/lopdf with Document::load and Document.*
$> touch x.json
$> ffs x.json
thread 'main' panicked at 'JSON: Error("EOF while parsing a value", line: 1, column: 0)', src/format.rs:187:53
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Investigate weird unmounts on macOS, producing:
WARN fuser::mnt::fuse2: umount failed with Os { code: 22, kind: InvalidInput, message: "Invalid argument" }
See actions log. Linux seems fine.
Lazy loading (scan but don't parse into structures until demanded). #50
Ahead-of-time mappings (literally build/unbuild FS).
Use 9p, per Jay McCarthy. Or use the Bento FS thing?
Timing: use https://crates.io/crates/tracing-timing?
Microbenchmarks
Macrobenchmarks
Compare against
For example, i3blocks supports multiple signals, which you can trigger using `pkill -SIGRTMIN+10 i3blocks`, which then causes an action.
Could you add at least two signals: one for re-reading the INPUT file and another for writing the OUTPUT file?
This would open ffs up for use in scripts and would allow updating INPUT and/or OUTPUT files without needing to restart ffs.
The JSON object `{ ".": 5 }` will create a file named `dot` holding the number `5`... but it will dump back to JSON without restoring the name.
The best solution is probably to track an "unmunged" name, and files that don't undergo a `rename` call go back exactly how they were.
This unmunged name is metadata, so the appropriate mechanism for working with it is up in the air (see #2).
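Leaving the storage mechanism aside, the bookkeeping could be sketched like this. The `Entry` type and its methods are hypothetical. Only the `.` to `dot` mapping comes from the description above; the `..` case is an assumption added for illustration.

```rust
/// A directory entry that remembers the name it had in the source document.
struct Entry {
    name: String,             // name shown in the filesystem
    original: Option<String>, // source name, kept only when munging changed it
}

/// Illustrative munging: `.` -> `dot` is from the issue text;
/// `..` -> `dotdot` is an assumption.
fn munge(name: &str) -> String {
    match name {
        "." => "dot".to_string(),
        ".." => "dotdot".to_string(),
        other => other.to_string(),
    }
}

impl Entry {
    fn from_source(source_name: &str) -> Self {
        let name = munge(source_name);
        let original = if name == source_name {
            None
        } else {
            Some(source_name.to_string())
        };
        Entry { name, original }
    }

    /// A rename is an explicit user choice, so the remembered name is dropped.
    fn rename(&mut self, new_name: &str) {
        self.name = new_name.to_string();
        self.original = None;
    }

    /// Name to write back when serializing.
    fn output_name(&self) -> &str {
        self.original.as_deref().unwrap_or(&self.name)
    }
}
```

With this, an entry loaded from `"."` shows up as `dot` but serializes as `"."` again, unless the user renamed it in the meantime.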
Several format renderers can put the output in a nice, human-readable format. Why not treat yourself? We all deserve nice things.
Per cberner/fuser#153 (comment), better to rely on the explicit contract than on the current implementation in fuser calling `mem::drop`.
On a malformed JSON file:
: mgree@rocinante:~/talks/20211025plos [master] ; ~/ffs/target/release/ffs -i demo.json
thread 'main' panicked at 'JSON: Error("expected value", line: 7, column: 12)', src/format.rs:322:45
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
https://twitter.com/laurencetratt/status/1451198313085030407 seems to indicate a way forward. OO!
Saves you some trouble (e.g., passing `{}` in on STDIN).
Depends on #2 in part.