Comments (5)
Because I am not familiar with Java, I asked ChatGPT about RedisShardBackplane's getStorageWorkers, and I want to know whether its answer is right. This is its answer:
Based on the error log, it seems that there is a thread blocking issue occurring within the synchronized block of the getStorageWorkers() method. Specifically, the thread "grpc-default-executor-7603" is being blocked while waiting to acquire the lock on the RedisShardBackplane object.
To address this issue, you can consider the following:
Reduce the execution time of the synchronized block: Ensure that operations within the synchronized block are completed quickly. Avoid performing time-consuming operations within the synchronized block, especially avoid calling methods that may cause blocking.
Avoid calling potentially blocking methods within the synchronized block: The call to client.call(this::fetchAndExpireStorageWorkers) within the synchronized block may cause blocking. You may consider moving this call outside the synchronized block or executing it asynchronously to prevent blocking other threads.
Optimize synchronization of data structures: While the storageWorkerSet is being operated on within the synchronized block to prevent concurrent modifications, there may still be issues with concurrent modification. You can use thread-safe data structures or implement finer-grained synchronization when modifying the collection.
Based on these suggestions, you can try modifying the code as follows:
@SuppressWarnings("ConstantConditions")
@Override
public synchronized Set<String> getStorageWorkers() throws IOException {
  if (storageWorkersDeadline == null || storageWorkersDeadline.isExpired()) {
    Set<String> newWorkerSet = client.call(this::fetchAndExpireStorageWorkers);
    synchronized (storageWorkerSet) {
      storageWorkerSet.clear();
      storageWorkerSet.addAll(newWorkerSet);
    }
    storageWorkersDeadline = Deadline.after(workerSetMaxAge, SECONDS);
  }
  synchronized (storageWorkerSet) {
    return new HashSet<>(storageWorkerSet);
  }
}
In this modification, the call to client.call(this::fetchAndExpireStorageWorkers) has been moved outside the synchronized block to avoid potential blocking issues within the synchronized block. Additionally, synchronization has been added when returning the storageWorkerSet to ensure that there are no concurrent modification issues when returning the collection.
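One thing to check in the snippet above: the method is still declared `synchronized`, so the call to client.call(...) still runs while the object's monitor is held, and other callers would still block on it. A minimal sketch of what fetching outside the lock could look like is below. It is not Buildfarm code: `WorkerSetCache` and `Fetcher` are hypothetical names standing in for the backplane and for client.call(this::fetchAndExpireStorageWorkers), and it assumes Java 10+ for Set.copyOf.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the backplane's cached worker set; not Buildfarm code. */
final class WorkerSetCache {
  /** Hypothetical stand-in for client.call(this::fetchAndExpireStorageWorkers). */
  interface Fetcher {
    Set<String> fetch() throws IOException;
  }

  private final Fetcher fetcher;
  private final long maxAgeNanos;
  private final Object lock = new Object();

  // All three fields below are guarded by lock.
  private Set<String> workers = Set.of();
  private long deadlineNanos;
  private boolean loaded = false;

  WorkerSetCache(Fetcher fetcher, long maxAge, TimeUnit unit) {
    this.fetcher = fetcher;
    this.maxAgeNanos = unit.toNanos(maxAge);
  }

  Set<String> getStorageWorkers() throws IOException {
    // Short critical section: only read the cached state.
    synchronized (lock) {
      if (loaded && System.nanoTime() - deadlineNanos < 0) {
        return workers; // immutable snapshot, safe to hand out
      }
    }

    // The slow, possibly blocking fetch runs with no lock held.
    // Trade-off: several threads that see an expired cache may all fetch.
    Set<String> fresh = Set.copyOf(fetcher.fetch());

    // Short critical section: publish the new snapshot and deadline.
    synchronized (lock) {
      workers = fresh;
      deadlineNanos = System.nanoTime() + maxAgeNanos;
      loaded = true;
      return workers;
    }
  }
}
```

With this shape a slow Redis call delays only the thread that makes it; the cost is that concurrent callers may issue duplicate fetches when the cache expires. Whether that trade-off suits the real backplane is a separate question.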
from bazel-buildfarm.
It's similar to issue 1426: #1426
from bazel-buildfarm.
You're running code that is at least 6 months old, as ShardInstance.java no longer exists. Please update and see if this hang reproduces.
There are no longer any synchronizations in getStorageWorkers that could cause this.
from bazel-buildfarm.
You're running code that is at least 6 months old, as ShardInstance.java no longer exists. Please update and see if this hang reproduces.
There are no longer any synchronizations in getStorageWorkers that could cause this.
Okay, thanks. I'll upgrade and see how it goes.
from bazel-buildfarm.
Going to assume this has been corrected. Closing for now, feel free to reopen if there's anything to add here.
from bazel-buildfarm.