Comments (5)
I submitted the issue without the description by accident and have since edited it to include my actual question ^
from bazel-buildfarm.
Posted an example configuration that demonstrates this problem when building tools_remote: #1343 (comment)
This action in particular, ImageLayer producing a tar, is a proportionally expensive task: even a small change to the input set produces a new output whose size is proportional to the whole input set. It is also an IO-bound rather than compute-bound task, which will be made worse by the limited IOPS capacity of your buildfarm workers, leaving you with the unused CPU and RAM you see.
I recommend using --remote_grpc_log to enumerate the requests being performed, then building a timeline of each request to break down where the performance gap is. The requests will overlap, but the sequence has an obvious critical path: FindMissingBlobs (FMB) will complete before Writes, which will complete before Execute, which will complete before Reads. Network and disk IO measurements during the execution period will also be illuminating, particularly if either of them is pegged at capacity.
Comparing such a breakdown between systems will show where your bottleneck is, how much room for improvement there is, and where engineering effort on buildfarm could help.
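As a rough illustration of the timeline breakdown described above, here is a minimal Python sketch. It assumes you have already extracted a method name plus start and end timestamps for each call from the gRPC log; that extraction step is not shown, and the `Call` record, the function names, and the timings below are all hypothetical.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """One remote request (hypothetical record shape, not the log's own format)."""
    method: str   # e.g. "FindMissingBlobs", "Write", "Execute", "Read"
    start: float  # seconds since the build began
    end: float    # seconds since the build began


def phase_spans(calls):
    """Return {method: (first_start, last_end, call_count)} across all calls."""
    spans = {}
    for c in calls:
        first, last, count = spans.get(c.method, (c.start, c.end, 0))
        spans[c.method] = (min(first, c.start), max(last, c.end), count + 1)
    return spans


def longest_phase(calls):
    """Name of the method whose calls cover the longest wall-clock span."""
    spans = phase_spans(calls)
    return max(spans, key=lambda m: spans[m][1] - spans[m][0])


# Made-up timings, for illustration only.
calls = [
    Call("FindMissingBlobs", 0.0, 0.4),
    Call("Write", 0.3, 6.1),
    Call("Write", 0.5, 7.9),
    Call("Execute", 8.0, 9.5),
    Call("Read", 9.4, 12.0),
]

# Print phases in the order they begin, with the wall-clock span of each.
for method, (first, last, count) in sorted(
    phase_spans(calls).items(), key=lambda kv: kv[1][0]
):
    print(f"{method:<18} {count:>3} calls  {first:6.1f}s -> {last:6.1f}s  "
          f"({last - first:.1f}s wall)")
print("longest phase:", longest_phase(calls))
```

Running the same summary against a local build and a remote build, alongside network and disk IO measurements, should make it easier to see which phase accounts for the difference.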
Thanks, @werkt. I suspected that something about the IO would be the bottleneck here, and I will use your advice to track down exactly where it is. In general, would you recommend that docker image rules be executed locally, or are there cases that still merit executing remotely?
Separately, I have an example of (what I think is) a non-IO-bound build that takes much longer on remote than on local -- would you be able to take a look at that? Much appreciated! #1343
I strongly recommend executing that particular action locally. There have historically been some issues with identifying these via tags (I'm not sure of the precise current state), but the failsafe way is the one Alex Eagle recommends here: https://blog.aspect.dev/bazelrc-flags (look for --modify_execution_info). If you use ImageLayer in place of PackageTar there (and definitely add the digest summing mnemonics of rules_docker), you will be able to ensure they run locally. Even a 9x cost of the 'commercial' RE is too much to pay, and in that case you also have to be concerned about how much of a machine's bandwidth budget you're attributing to each individual request.
Will have a look at that task regarding non-IO compromises.
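For reference, the --modify_execution_info recommendation above might look roughly like the following .bazelrc fragment. This is a sketch, not a tested configuration: the `no-remote` execution requirement is an assumption about which tag is wanted here (it disables both remote execution and remote caching for matching actions, whereas `no-remote-exec` disables only remote execution), and the rules_docker digest mnemonics are left out because the thread does not name them.

```
# .bazelrc (sketch)
# Mark every action whose mnemonic matches ImageLayer as local-only.
# Further mnemonics can be appended as comma-separated <regex>=+<key> pairs.
build --modify_execution_info=ImageLayer=+no-remote
```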