
Comments (11)

wez commented on May 4, 2024

We've had a couple of incidents of this type, where something about the mach ports goes bad.
Restarting the watchman server is the only effective workaround at the moment.
The high-level game plan for resolving this is to try to detect this in the application (that error doesn't appear to bubble up to our code) and self-terminate.

from watchman.

amasad commented on May 4, 2024

In my case it required a full machine reboot. Maybe the application could tail the log and look for this error and say something to the user.
Maybe this could be interesting to you. Node v0.10 suffers from the same problem but it seems to have been fixed in v0.11. Here is the relevant commit:
joyent/libuv@cd2794c

wez commented on May 4, 2024

Do you have a large number of watched roots?
I'll keep the libuv approach in mind when I sit down to look at this in earnest.

amasad commented on May 4, 2024

A subdir inside fbobjc. It has 12964 files. EDIT: the number of roots is 2399.

wez commented on May 4, 2024

In watchman, a root is a top-level watched dir. You can count the roots by counting the entries in the roots field of the watchman watch-list output. Does that match your 2399 number?
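A minimal sketch of that count in Python, assuming the watchman binary is on PATH and that watch-list prints a JSON object with a roots array. The function names here are made up for illustration:

```python
import json
import subprocess


def count_roots(watch_list_json: str) -> int:
    """Return the number of entries in the "roots" field of watch-list output."""
    response = json.loads(watch_list_json)
    # A response without a "roots" field is treated as having zero roots.
    return len(response.get("roots", []))


def count_live_roots() -> int:
    """Ask the running watchman server for its watch list and count the roots."""
    result = subprocess.run(
        ["watchman", "watch-list"],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_roots(result.stdout)
```

Calling count_live_roots() on the affected machine should give the number to compare against 2399.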

Some work in progress that I think will help with this issue:

https://reviews.facebook.net/D28161 - use launchd to spawn/shutdown watchman on logout
https://reviews.facebook.net/D28179 - do a better job at surfacing this class of error

If you do have a very large number of watches I'd like to understand what's going on there. Since that may be FB specific, please reach out to me offline from GitHub to go over that part and then we can summarize back here.

wez commented on May 4, 2024

So, I can reproduce this specific error with the following test scenario:

for i in $(seq 1 4096) ; do echo "$i" ; mkdir -p "/tmp/wat/$i" ; watchman watch "/tmp/wat/$i" ; done
...
414
{
    "version": "2.9.9",
    "error": "FSEventStreamStart failed"
}

When running with the diffs I mentioned above, we report an error once we get to around 400 or so roots. (Without them, we silently pretend that all is good.)

In the log file:

2014-11-01 15:13 watchman[78364] (FSEvents.framework) FSEventStreamStart: register_with_server: ERROR: f2d_register_rpc() => (null) (-21)
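The idea raised earlier in the thread, scanning the log for this error so it can be shown to the user, could be sketched roughly as follows. The marker string is taken from the log line above. Which log file to read is left to the caller, and the function name is hypothetical:

```python
# Substring taken from the FSEvents log line quoted in this thread.
FSEVENTS_MARKER = "FSEventStreamStart: register_with_server: ERROR"


def find_fsevents_errors(lines):
    """Return the log lines that contain the FSEventStreamStart registration error."""
    return [line.rstrip("\n") for line in lines if FSEVENTS_MARKER in line]
```

It accepts any iterable of lines, so an open file object works as input.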

Other systems have limits on the number of roots that they support (e.g. Linux has /proc/sys/fs/inotify/max_user_instances as a configurable limit), and this is a pretty sizable number of roots. Adding machinery to multiplex over the same stream would impose some perf cost in matching every path change to a root: it involves locking the watch list and finding the longest prefix match for every file change event, versus what is currently "free". I'm not thrilled at the idea of supporting this if we can solve it more simply.
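To illustrate the per-event cost in question, here is a rough sketch of longest-prefix matching of a changed path against a list of roots. It is not watchman's code: the function name is invented, paths are assumed to be absolute POSIX paths, and the locking of the watch list is omitted:

```python
from typing import Iterable, Optional


def find_root(path: str, roots: Iterable[str]) -> Optional[str]:
    """Return the longest root that contains path, or None if no root matches."""
    best = None
    for root in roots:
        # Normalise trailing slashes, keeping "/" itself intact.
        root = root.rstrip("/") or "/"
        prefix = root if root == "/" else root + "/"
        # Match on whole path components so /tmp/wat/1 does not claim /tmp/wat/10.
        if path == root or path.startswith(prefix):
            if best is None or len(root) > len(best):
                best = root
    return best
```

This linear scan would run once per file change event, so with hundreds of roots the work adds up compared with one stream per root.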

For example, if you are watching multiple dirs within a project, consider instead watching a common root at a higher level in the tree.
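One way to find such a common root, sketched in Python with the standard library. The directory names in the usage note are hypothetical, and POSIX-style absolute paths are assumed:

```python
import posixpath


def common_watch_root(dirs):
    """Return the deepest directory that is an ancestor of every path in dirs."""
    # posixpath.commonpath compares whole path components, not raw characters.
    return posixpath.commonpath(dirs)
```

For instance, common_watch_root(["/repo/proj/src/a", "/repo/proj/src/b", "/repo/proj/tests"]) gives "/repo/proj", which could then be watched as a single root in place of the three separate dirs.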

In our devserver environments we deploy watchman with https://facebook.github.io/watchman/docs/config.html#root-restrict-files to force watches to match up to repo roots. It is unusual for a typical engineer to have so many repos checked out and in active use, so this shouldn't be particularly onerous.

Can you tell me more about how you're using watchman here?

Cleanup-wise, no reboot is needed. To repair this state:

watchman shutdown-server
rm $TMPDIR/.watchman.$USER.state

amasad commented on May 4, 2024

I was actually just experimenting with it when I hit the bug. We currently use Node for watching and hit the above-mentioned Node bug, so I started playing with watchman and then ran into this one. We were indeed watching a number of dirs that had a common ancestor that we could be watching instead. I just looked, and the # of dirs is around 377.

Now, it could be that fseventsd was already in a bad state when I started playing around with watchman, which made it easier to hit this bug. Anyway, I want to try again tomorrow by watching a common ancestor. Are you planning on doing a new release after landing the diffs you mentioned?

sunshowers commented on May 4, 2024

@amasad -- I'm responsible for deploying watchman on FB corporate laptops -- we'll probably do one next week.

wez commented on May 4, 2024

783036b and e6367a5 have landed

wez commented on May 4, 2024

I've tagged 3.0.0 and pushed an fb-watchman client for node to npm (ef860b0) to help tackle the node side of this issue.
I've also opened a PR to update homebrew (Homebrew/legacy-homebrew#33883).

I think we've covered the various aspects of this, and are just pending deployment to FB infra (which we can track offline from GitHub).

Let's close this and if further issues arise, track them separately.

amasad commented on May 4, 2024

Awesome, thanks for the quick turnaround!
