mgree / ffs
the file filesystem: mount semi-structured data (like JSON) as a Unix filesystem
Home Page: https://mgree.github.io/ffs/
License: GNU General Public License v3.0
Timing: use https://crates.io/crates/tracing-timing?
Microbenchmarks
Macrobenchmarks
Compare against
Per cberner/fuser#153 (comment), better to rely on the explicit contract than on the current implementation in fuser calling mem::drop.
Depends on #2 in part.
https://twitter.com/laurencetratt/status/1451198313085030407 seems to indicate a way forward. OO!
Lazy loading (scan but don't parse into structures until demanded). #50
Ahead-of-time mappings (literally build/unbuild FS).
Use 9p, per Jay McCarthy. Or use the Bento FS thing?
A --live flag to sync the JSON out on every write.
A --watch flag to notice updates to the JSON file and rebuild the FS.
With lazy/incremental, this would be complicated.
Rather than taking a mountpoint as an argument, use an optional --mount flag.
If the input is over STDIN, you have to specify a --mount.
If the input is from a file foo.EXT, then try to make a directory foo. If that file already exists, give up and tell the user about --mount. If it doesn't, great: do that. When things unmount, double check the directory is empty and then remove it.
Need to revise all the tests, and then test a few configurations of this feature.
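The mountpoint rules above could be sketched roughly as follows. This is only an illustration, not the ffs implementation: `choose_mountpoint` and `MountError` are hypothetical names, and the function only decides on a path (creating the directory and cleaning it up on unmount are left out).

```rust
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch (not part of ffs).
#[derive(Debug, PartialEq)]
enum MountError {
    /// Input comes from STDIN and no --mount was given.
    StdinNeedsMount,
    /// The inferred directory already exists; the user should pass --mount.
    AlreadyExists(PathBuf),
    /// The input path has no usable file name (e.g. "/").
    NoFileName,
}

/// Decide where to mount. `input` is `None` when reading from STDIN;
/// `mount` is the value of the optional --mount flag.
fn choose_mountpoint(
    input: Option<&Path>,
    mount: Option<&Path>,
) -> Result<PathBuf, MountError> {
    // An explicit --mount always wins.
    if let Some(m) = mount {
        return Ok(m.to_path_buf());
    }
    // Without --mount, STDIN input cannot be mounted anywhere sensible.
    let input = input.ok_or(MountError::StdinNeedsMount)?;
    // foo.EXT -> foo, in the same parent directory as the input.
    let stem = input.file_stem().ok_or(MountError::NoFileName)?;
    let candidate = input.with_file_name(stem);
    if candidate.exists() {
        return Err(MountError::AlreadyExists(candidate));
    }
    Ok(candidate)
}
```

Note that an input with no extension (plain `foo`) infers `foo` itself, which already exists, so this sketch sends the user to --mount in that case.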
Would be good to build a .deb file for this.
https://github.com/mmstick/cargo-deb seems promising.
An idea from @angelhof: notice directories where every file has numerical name suffixes (e.g., role0, role1, or simpliciter 0, 1, 2, ...) and automatically treat them as lists.
The best way to do this is to use the auto type for directories, too, and new directories have that type assigned.
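One possible reading of the heuristic, as a minimal sketch: `looks_like_list` is a hypothetical name, and two details are assumptions not stated above, namely that all entries must share the same (possibly empty) prefix and that an empty directory is not treated as a list.

```rust
/// Returns true when every name is a shared (possibly empty) prefix followed
/// by at least one ASCII digit, e.g. ["role0", "role1"] or ["0", "1", "2"].
/// Assumption: an empty directory does not count as a list.
fn looks_like_list(names: &[&str]) -> bool {
    if names.is_empty() {
        return false;
    }
    let mut prefix: Option<&str> = None;
    for name in names {
        // Strip the trailing run of digits; what remains is the prefix.
        let stem = name.trim_end_matches(|c: char| c.is_ascii_digit());
        if stem.len() == name.len() {
            // No numeric suffix at all.
            return false;
        }
        match prefix {
            None => prefix = Some(stem),
            // Assumption: mixed prefixes (a0, b1) are not a list.
            Some(p) if p != stem => return false,
            _ => {}
        }
    }
    true
}
```

A directory typed as auto could run a check like this when the filesystem is dumped back out, falling back to an object/map when it returns false.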
$> touch x.json
$> ffs x.json
thread 'main' panicked at 'JSON: Error("EOF while parsing a value", line: 1, column: 0)', src/format.rs:187:53
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Several format renderers can put the output in a nice, human-readable format. Why not treat yourself? We all deserve nice things.
On a malformed JSON file:
: mgree@rocinante:~/talks/20211025plos [master] ; ~/ffs/target/release/ffs -i demo.json
thread 'main' panicked at 'JSON: Error("expected value", line: 7, column: 12)', src/format.rs:322:45
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Keep a free list, and reuse inode numbers.
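A free list for inode numbers might look like the sketch below. `InodeAllocator` is a hypothetical type, not ffs code; the only fixed fact it relies on is that FUSE reserves inode 1 for the root. A real implementation would probably also need to think about FUSE generation numbers when an inode number is reused while the kernel still remembers the old file.

```rust
/// Hands out inode numbers, preferring ones that were released earlier.
struct InodeAllocator {
    /// Next never-used inode number.
    next: u64,
    /// Released inode numbers available for reuse.
    free: Vec<u64>,
}

impl InodeAllocator {
    fn new() -> Self {
        // Inode 1 is the FUSE root, so fresh numbers start at 2.
        InodeAllocator { next: 2, free: Vec::new() }
    }

    /// Reuse a released number if there is one, else mint a new one.
    fn alloc(&mut self) -> u64 {
        if let Some(ino) = self.free.pop() {
            return ino;
        }
        let ino = self.next;
        self.next += 1;
        ino
    }

    /// Return an inode number to the free list (e.g. after unlink/rmdir).
    fn release(&mut self, ino: u64) {
        self.free.push(ino);
    }
}
```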
Request: https://twitter.com/wtfpdf/status/1409883313888804868
Promising lead on a Rust library: https://github.com/J-F-Liu/lopdf with Document::load and Document.*
Saves you some trouble (e.g., passing {} in on STDIN).
😢
Investigate weird unmounts on macOS, producing:
WARN fuser::mnt::fuse2: umount failed with Os { code: 22, kind: InvalidInput, message: "Invalid argument" }
See actions log. Linux seems fine.
For example, i3blocks supports multiple signals, which you can trigger using pkill -SIGRTMIN+10 i3blocks, which then causes an action.
Could you add at least two signals, one for re-reading the INPUT file and another for writing the OUTPUT file?
This would open ffs up for use in scripts and would allow updating INPUT and/or OUTPUT files without needing to restart ffs.
The move to macos-latest in #71 will have us generating ARM executables only. We should generate both ARM and x86 for macOS.
Consider the following TOML:
[a]
array = [ "a", 2, { x = "y", z = 3 } ]
key1 = "value1"
key2 = "value2"
It passes validation at https://www.toml-lint.com even tho it's slightly unconventional.
When I do the following
❯ ffs -i x.toml&
[1] 469278
❯ tree x
x
└── a
    ├── array
    │   ├── 0
    │   ├── 1
    │   └── 2
    │       ├── x
    │       └── z
    ├── key1
    └── key2
4 directories, 6 files
it seems valid.
All the data is valid too:
❯ cat x/a/array/0
a
❯ cat x/a/array/1
2
❯ cat x/a/array/2/x
y
❯ cat x/a/array/2/z
3
❯ cat x/a/key1
value1
❯ cat x/a/key2
value2
value2
However when I unmount it, it's completely broken:
❯ umount x
[1] + 469278 done ffs -i x.toml
❯ cat x.toml
[a]
key1 = "value1"
key2 = "value2"
array = ["a", 2
[[a.array]]
x = "y"
z = 3
]
The JSON object { ".": 5 } will create a file named dot holding the number 5... but it will dump back to JSON without restoring the name.
The best solution is probably to track an "unmunged" name, and files that don't undergo a rename call go back exactly how they were.
This unmunged name is metadata, so the appropriate mechanism for working with it is up in the air (see #2).
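As a rough sketch of the "unmunged name" idea: the `Entry` type and its methods below are hypothetical, and only the `.` to `dot` mapping comes from the description above. Where this metadata should actually live is the open question from #2, so a plain struct field is just a stand-in.

```rust
/// A directory entry that remembers the key it was created from.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    /// Name as it appears in the mounted filesystem.
    name: String,
    /// Original key, kept only when munging changed it.
    original: Option<String>,
}

impl Entry {
    /// Build an entry from an input key, munging names the FS cannot hold.
    fn from_key(key: &str) -> Entry {
        match key {
            // "." cannot be a filename, so it becomes "dot".
            "." => Entry { name: "dot".to_string(), original: Some(key.to_string()) },
            _ => Entry { name: key.to_string(), original: None },
        }
    }

    /// A rename is a deliberate user choice, so forget the original key.
    fn rename(&mut self, new_name: &str) {
        self.name = new_name.to_string();
        self.original = None;
    }

    /// Key to write when dumping back out: the original if untouched.
    fn key_for_output(&self) -> &str {
        self.original.as_deref().unwrap_or(self.name.as_str())
    }
}
```

With this shape, an entry created from `.` that is never renamed dumps back as `.`, while a renamed entry dumps under its new name.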
Make a little doc/gh-pages directory. Include one of those cute screencast GIFs.
We have a lot of sleep calls that could be faster.
Write a tool to detect when a mount is ready. (Current benchmarks do this by busy looping on umount.) Use its exit status to determine when it's okay to proceed on the mount and to proceed to checking that the process has completed.
It'll also be useful to have a way to pad a few milliseconds on, i.e., unmount means the process may not be done yet. The current utils/timeout script is okay, but (a) it's noisy, and (b) it'd be nice to not fix it on ffs when we start. Maybe a waitfor command that takes a time limit before killing the process?
Trying to mount a JSON file within the virtual filesystem fails (which is ok, I guess):
raphael@tukk ~/Downloads> ./ffs -i ffs1.json
raphael@tukk ~/Downloads> cat ffs1.json
{"option":"test"}
raphael@tukk ~/Downloads> cp ffs1.json ffs1/
raphael@tukk ~/Downloads> ls ffs1/
ffs1.json option
raphael@tukk ~/Downloads> ./ffs -i ffs1/ffs1.json
fusermount: bad mount point ffs1/ffs1: Permission denied
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:38:41
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
This actually creates the subdirectory ffs1/ffs1 but does not mount anything there. Unmounting the filesystem results in an additional entry "ffs1: {}" in the JSON file. I'd suggest that a failed command should not leave behind any new files/members.
There are plenty of interesting options on https://man7.org/linux/man-pages/man8/mount.fuse3.8.html.
When running on the citylots JSON file, parsing takes a long, long time. Way longer than I expected---since I was told to "expect in the ballpark of 500 to 1000 megabytes per second deserialization". I suspect I'm doing something wrong.
Relatedly, lazy loading only saves time on loading, not parsing. Can we use serde_json::RawValue and similar tricks to parse but defer construction? I'm not sure every format supports this.
Started working in https://github.com/mgree/ffs/tree/homebrew.
Really, no idea what I'm doing. Not sure how to depend on a cask (macfuse) and I have the sinking feeling that you can't. Oy.
Just load/unload data directly on the filesystem.
Progress indicator. https://docs.rs/indicatif/0.16.2/indicatif/
Multithreading? (Is that ever a performance win?)
BFS?
Flag to follow symlinks, possibly even out of the root.
-i flag for prompting on weirdness.
I thought I might be able to create a JSON array, but whatever I do, it always generates an object with null values for me.
then saved back into JSON as "space"