Comments (17)
Oh I see, I didn't notice the total time for the dirstream one. Makes sense, although without knowing more about why it's behaving that way, I don't think we can rule out a performant pipes implementation being possible.
Anyway, it would be good to see your measurements on the hfsnotify functions, but here's the way I'm thinking about this -- please let me know how it matches up with your priorities.
I think we could invest effort in optimizing the crucial section here, which seems to be the findDirs function. Figuring out which existence-check functions are most performant seems prudent. But it seems the thing that would make the biggest change would be to switch to using RawFilePath or similar. Which I think is a great idea, but it would be a bit of work to do without changing the API of this library. (At some point RawFilePaths would need to be decoded into FilePaths and any errors or decoding discrepancies dealt with.)
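For illustration, that decoding step could look roughly like the sketch below. It uses GHC's filesystem encoding; RawFilePath from the unix package is just a ByteString, so a plain ByteString stands in for it here, and the function name is made up:

```haskell
import qualified Data.ByteString as B
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import GHC.Foreign (peekCStringLen)
import GHC.IO.Encoding (getFileSystemEncoding)

-- Hypothetical helper: decode raw path bytes into a FilePath using
-- the program's filesystem encoding. GHC's filesystem encoding
-- round-trips undecodable bytes via surrogate escapes instead of
-- throwing, so arbitrary on-disk names survive the conversion.
decodeRawFilePath :: B.ByteString -> IO FilePath
decodeRawFilePath bs = do
  enc <- getFileSystemEncoding
  unsafeUseAsCStringLen bs (peekCStringLen enc)
```

Going the other direction (FilePath back to bytes) could use the same encoding via GHC.Foreign.withCStringLen, which is where any decoding discrepancies would have to be dealt with.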
On the other hand, it sounds like what you really need is a way to avoid running over gigantic directories like node_modules in the first place, right? It's always better not to create watches you don't need, since each watch comes with a kernel memory cost and brings you closer to hitting your inotify watch limit. The fact that you're getting perf problems from this directory traversal suggests that you're creating too many watches :P. Doing a filtered version of watchTree like @Janiczek mentioned would not be difficult.
from hfsnotify.
Thanks! Interesting how the biggest amount of time is actually in the getDirectoryContentsPath.exists function. I just pushed a commit that might help when newer versions of System.Directory are available: 4bb7d27.
Hmm, this directory listing function will still realize all the directories in memory at once, right? If we're optimizing that then I'd rather do it in a streaming fashion so the peak memory usage is low.
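For comparison, a streaming version could take an action to run per directory instead of materializing the whole list, so only one directory's entries are resident at a time. A minimal sketch using only the directory package (traverseDirs is an invented name; a real version would also need to skip symlinks to avoid cycles):

```haskell
import Control.Monad (filterM, forM_)
import System.Directory (doesDirectoryExist, listDirectory)
import System.FilePath ((</>))

-- Run the action on every directory as soon as it is discovered,
-- instead of collecting all of them into one big list first.
traverseDirs :: (FilePath -> IO ()) -> FilePath -> IO ()
traverseDirs act root = do
  act root
  entries <- listDirectory root
  subdirs <- filterM doesDirectoryExist (map (root </>) entries)
  forM_ subdirs (traverseDirs act)
```

With something shaped like this, adding watches happens incrementally as the tree is walked, so peak memory stays proportional to the deepest path rather than the whole tree.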
Okay, I made a stab at an efficient implementation. There are a few TODOs before it's ready to ship but the tests are passing. Want to try it? I looked at your lib for reference on the dirstream stuff :)
https://github.com/haskell-fswatch/hfsnotify/blob/master/src/System/FSNotify/Linux.hs#L193
For the filtering stuff, I think we want to just add a new watchTree function that takes an extensible options object:
watchTree' :: WatchManager -> RecursiveListeningOptions -> FilePath -> ActionPredicate -> Action -> IO StopListening
data RecursiveListeningOptions = RecursiveListeningOptions {
  recursiveListeningOptionsDirectoryFilter :: FilePath -> Bool -- IO Bool ?
}

defaultRecursiveListeningOptions = RecursiveListeningOptions {
  recursiveListeningOptionsDirectoryFilter = const True
}
I haven't had time to do the filtering stuff yet (it will need to be added to each platform) but a PR would be welcome if you want it soon.
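Under the hood the filter just prunes the recursion; a rough sketch of what the traversal could look like (findDirsFiltered is a made-up name, not the actual hfsnotify internals):

```haskell
import Control.Monad (filterM)
import System.Directory (doesDirectoryExist, listDirectory)
import System.FilePath ((</>))

-- Collect directories recursively, but skip entire subtrees
-- (e.g. node_modules) the moment the predicate rejects them,
-- so no listings or watches happen below a rejected directory.
findDirsFiltered :: (FilePath -> Bool) -> FilePath -> IO [FilePath]
findDirsFiltered keep root = do
  entries <- listDirectory root
  subdirs <- filterM doesDirectoryExist (map (root </>) entries)
  rest <- mapM (findDirsFiltered keep) (filter keep subdirs)
  pure (root : concat rest)
```

The important property is that pruning happens before descending, so a rejected directory costs one predicate call instead of a full subtree walk.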
Hmm, I don't see anything that looks O(N^2), but a couple things jump out at me in the recursive watch code:
1. The recursive watch code first gathers all subdirectories into a list before adding any watches, which is probably not great for memory usage and may result in higher CPU usage due to garbage collection (?)
2. The code then loops over these subdirectories and adds a watch to each of them. Each of these operations requires waiting on a global MVar, and also creating one new MVar per watch. Either of these things might be causing high CPU.
We could improve 1) by gathering directories and adding watches in a streaming fashion, perhaps with pipes or conduit. We could perhaps improve 2) by restructuring our use of MVars, but it would have to be done carefully.
Any chance you could create a flamegraph with https://hackage.haskell.org/package/ghc-prof-flamegraph ?
I will try to find time to generate the flamegraph and get back to you.
By the way, you should be able to reproduce this behaviour based solely on the number of directories watched -- tens of thousands of directories should provoke it.
By the way 2: my friend @turboMaCk has tried a continuation-passing-style approach to listing the contents of a directory recursively that seems to be pretty performant even at higher file/dir counts:
https://github.com/turboMaCk/unix-recursive
EDIT: given the profile info in the original post, I think your point 2 won't be the culprit, since it didn't show up in the profile info. It all seems to be in directory manipulation, canonicalization etc.
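For anyone trying to reproduce at that scale, a throwaway generator along these lines (names and counts are arbitrary) creates enough directories for a recursive watch to struggle with:

```haskell
import Control.Monad (forM_)
import System.Directory (createDirectoryIfMissing)
import System.FilePath ((</>))

-- Create n two-level directory pairs under a scratch root, i.e.
-- roughly 2 * n directories for the watcher to visit.
makeTree :: FilePath -> Int -> IO ()
makeTree root n =
  forM_ [1 .. n] $ \i ->
    createDirectoryIfMissing True (root </> ("d" ++ show i) </> "sub")
```

Something like makeTree "/tmp/watch-stress" 20000 would yield about 40,000 directories, comfortably in the range described above.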
GitHub didn't let me upload an SVG, so here is a gist:
https://gist.github.com/Janiczek/f3bb8a1be8841b163aa233164bc67856
At the bottom there's a flamegraph attached.
@thomasjm There's also a possibility (that I'm currently doing in my app) of filtering certain folder IDs and not recursing deeper into them. I don't know how that fits into the FSNotify API / responsibilities, but perhaps it could be a configuration parameter: a FilePath -> Bool for whether you should care about a folder, e.g. folder /= "node_modules" or not $ Glob.match ignoredDirPattern folder.
That could cut a large chunk of unneeded work in certain usecases (like mine).
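Concretely, such a predicate is just a pure function over the directory path; something like this sketch (the ignore list is invented for illustration):

```haskell
import System.FilePath (takeFileName)

-- Reject a directory by its final path component; the walker then
-- never descends into it, saving both watches and traversal work.
shouldWatch :: FilePath -> Bool
shouldWatch dir = takeFileName dir `notElem` ignored
  where
    ignored = ["node_modules", ".git", "dist"]
```

A Glob.match-based variant would slot in the same way, since only the Bool result matters to the walker.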
@thomasjm Is there a way I could try out the mentioned commit without a proper release? Does Stack support specifying the version with a commit hash perhaps?
filtering certain folder IDs and not recursing deeper into them
Well, you could use the non-recursive watch function to set the watches on exactly the folders you want.
Does Stack support specifying the version with a commit hash perhaps?
Yes, like this in stack.yaml:
extra-deps:
  - git: https://github.com/haskell-fswatch/hfsnotify.git
    commit: 4bb7d27e3217207537f544e6e6602bfe55faf930
Hopefully the work in progress I have going on in the master branch doesn't cause problems; I was working on some breaking changes for the next release. If it does, you may need to cherry-pick -- you can point to a local checkout in stack.yaml like this:
packages:
- .
- /path/to/your/hfsnotify/checkout
Sorry to jump in. I'm sort of in the middle of identifying different pieces which could be optimized to work better on large FS trees. I've yet to identify exactly where the wins are, but I'll let you know if I find something, either by opening a PR if there are some quick wins or by opening an issue where we can discuss more specific details.
I had a deeper look into this issue, both by analyzing isolated things and by directly analyzing the performance of the code @Janiczek observed this behavior on. For the record it's this commit Janiczek/elmid@8cf9636, though I also needed to change a few hard-coded paths to make it possible for me to use.
I'm able to reproduce higher-than-optimal CPU loads, but as far as I can tell the code isn't O(n^2). It scales close to O(n) in the number of directories and files -- that being said, I did not put much more effort into analyzing big O once I discovered the scaling is acceptable.
For this and similar cases, the biggest problem is the inability of watchTree to filter which directories to recurse into. The primary cost is simply recursion into directories and files which are then ignored by the predicate anyway.
I also have a few points on @thomasjm's first reply:
Hmm, I don't see anything that looks O(N^2) but a couple things jump out at me in the recursive watch code
1. The recursive watch code first gathers all subdirectories into a list before adding any watches, which is probably not great for memory usage and may result in higher CPU usage due to garbage collection (?)
2. The code then loops over these subdirectories and adds a watch to each of them. Each of these operations requires waiting on a global MVar, and also creating one new MVar per watch. Either of these things might be causing high CPU.
We could improve 1) by gathering directories and adding watches in a streaming fashion, perhaps with pipes or conduit. We could perhaps improve 2) by restructuring our use of MVars, but it would have to be done carefully.
- I spent quite some time trying to find the optimal way to implement the listing. See the results of my measurements: https://github.com/turboMaCk/unix-recursive#comparision-with-other-libraries
- TLDR is that something like pipes/conduit would actually make performance worse, but they do make it easy to keep lazy IO and memory under control
- The memory profile of fsnotify is actually great considering the work it's doing
- is likely one of the things which slow fsnotify down, but there will likely be a tradeoff with different implementations.
Thanks for the information @turboMaCk, interesting stuff. A few thoughts/questions:
- It looks like a large amount of the performance improvement in your lib is explained by just the switch from FilePath to RawFilePath. Although it's a little hard to compare. Are you just recording overall CPU usage samples on your machine? It might help to smooth it out. Or, to be really fancy, you could consider perf.
- Any chance you could run this same test on the System.FSNotify.Path.findDirs function of hfsnotify? I'd be especially curious whether the new commit I mentioned above makes a difference, since it's looking like functions such as doesFileExist are a lot more expensive than I would have expected.
- Could you explain this statement a bit more?
pipes/conduit would actually make performance worse but they do make it easy to keep lazyio and memory under control
I'm confused because your results for dirstream, the one which uses pipes, seem to show the opposite (relatively good CPU usage but leaking memory).
@thomasjm I know the answer to 1: it's the psrecord tool, which only records one process ID (and optionally its children). So the graphs in @turboMaCk's repo don't show the whole system, just the measured apps.
- Yes, there is a huge difference between using packed ByteString and [Char]/String. Once you get close to an optimal implementation, the overhead of the linked list is quite a killer in both CPU time and memory. But also, for these measurements I'm essentially creating a list of all paths in a good portion of my $HOME. I'm not recording overall CPU, just the process, as @Janiczek said. Psrecord is like perf. Here is a script I used to measure -
- I'll try to measure the two versions before and after the commit.
I'm confused because your results for dirstream, the one which uses pipes, seems to show the opposite (relatively good CPU usage but leaking memory)
Yes, but it took the same program 70s, which is 10x longer to finish than the RawFilePath implementation and more than twice as long as the FilePath one. Which is still quite a good result imo, but it's also the hardest implementation to use, and it has the most dependencies, including the deprecated system-filepath.
So I have a result for the implementation from master/HEAD. I've run all the other tests as well, and basically the performance of the other implementations is the same as in the link I posted, so I won't spam this issue further.
CPU time is on par with dirstream; however, the current implementation consumes nearly 10GB of RAM, which is consistent with forcing the list too early, based on my other measurements.
I also agree with everything you say in your last post. I think in the ideal case there would be 2 separate issues and PRs coming from this one:
a) provide a new recursive API that can help folks avoid this trap of doing more work than is obvious/needed
b) optimize the System.FSNotify.Path functions to produce results lazily
I can help with tackling these if you want. I think especially b) should be doable without changes to the API, even of this internal module. a) will involve some API design.
Makes sense, although without knowing more about why it's behaving that way I don't think we can rule out a performant pipes implementation being possible.
While I think using conduit or pipes is a good way to develop an application, I personally don't like the idea of using them for library internals unless really necessary. Libraries with fewer dependencies are easier to maintain and are often preferred by developers because they don't force as many changes on them.
Reproducing my results:
All the code, including the fsnotify implementation, can be found here: https://github.com/turboMaCk/unix-recursive/tree/test-fsnotify-functions (make sure you're using the branch test-fsnotify-functions).
Just be aware:
- This won't work on Windows because of POSIX-specific functionality in my implementation.
- The bin flag needs to be passed during compilation to produce binaries; with stack this is --flag unix-recursive:bin.
- I have nix integration for stack enabled in stack.yaml. If you want to use stack but don't use nix, use the --no-nix flag.
- measure.sh [filepath] can be used to compare all implementations. The script might need some small adjustments in paths based on platform (unless you use nixos).
- It's a good idea to run it at least twice, because the kernel's FS cache can have an effect on the performance of this type of IO.
EDIT:
Sorry, one more thing I forgot to mention. I think the fact that the recursive directory walking is not as optimal as it should be is really not that big a deal at the moment, because with the current watchTree implementation, fsnotify attaches a watcher to all these files, so there will always be resources allocated for all the inodes it walks through.
But it seems the thing that would make the biggest change would be to change to using RawFilePath or similar.
If this happens, could we please also export the underlying RawFilePath versions of everything? Then the existing API can just wrap all those functions in pack and unpack.
That's what I've always wanted from this library - I end up converting all the FilePaths to/from RawFilePath in my own code anyway.
EDIT: The time of an abstract OsPath is finally almost upon us, so probably best to wait for that.
https://man7.org/linux/man-pages/man7/fanotify.7.html might be able to help with this?