Comments (8)
Hi,
For alltoall, I would recommend using
-DENABLE_NPKIT_EVENT_SEND_ENTRY
-DENABLE_NPKIT_EVENT_SEND_EXIT
-DENABLE_NPKIT_EVENT_RECV_ENTRY
-DENABLE_NPKIT_EVENT_RECV_EXIT
for GPU events, and
-DENABLE_NPKIT_EVENT_NET_SEND_ENTRY
-DENABLE_NPKIT_EVENT_NET_SEND_EXIT
-DENABLE_NPKIT_EVENT_NET_RECV_ENTRY
-DENABLE_NPKIT_EVENT_NET_RECV_EXIT
for net events.
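As a minimal sketch, the two flag lists above can be assembled into one string before it is handed to the build. The helper name `npkit_defines` is hypothetical and not part of NPKit. How the string reaches the compiler (for example through a make variable) depends on the NPKit sample in use and is not shown here.

```python
# Event names copied from the flags listed above.
GPU_EVENTS = ["SEND_ENTRY", "SEND_EXIT", "RECV_ENTRY", "RECV_EXIT"]
NET_EVENTS = ["NET_SEND_ENTRY", "NET_SEND_EXIT", "NET_RECV_ENTRY", "NET_RECV_EXIT"]


def npkit_defines(events):
    """Hypothetical helper: build the -DENABLE_NPKIT_EVENT_* defines for the given event names."""
    return " ".join(f"-DENABLE_NPKIT_EVENT_{name}" for name in events)


# Print the combined GPU and net defines as a single space-separated string.
print(npkit_defines(GPU_EVENTS + NET_EVENTS))
```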
from npkit.
I have tested the settings for GPU events, but the JSON output is "{"traceEvents": [], "displayTimeUnit": "ns"}" and the npkit_dump directory is empty.
Should I change the NCCL_PROTO and NCCL_ALGO settings?
There should always be files in NPKIT_DUMP_DIR. Which NPKit version did you use?
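One way to check both symptoms at once is sketched below: it lists the files in the dump directory and counts the events in the generated trace JSON. The function name `diagnose` and both paths in the usage note are placeholders chosen for illustration, not NPKit names. The only assumption about the trace file is the `traceEvents` key shown in the output quoted above.

```python
import json
import os


def diagnose(dump_dir, trace_path):
    """Report which files the dump directory holds and how many events the trace JSON contains."""
    # A missing directory is treated the same as an empty one.
    dump_files = sorted(os.listdir(dump_dir)) if os.path.isdir(dump_dir) else []
    with open(trace_path) as f:
        trace = json.load(f)
    return {
        "dump_files": dump_files,
        "num_events": len(trace.get("traceEvents", [])),
    }
```

For example, `diagnose(os.environ["NPKIT_DUMP_DIR"], "trace.json")` (with `trace.json` standing in for the real trace file) returns an empty `dump_files` list when nothing was dumped at run time, which points at the run rather than at the trace generation step.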
NPKit_NCCL
Got it. In that version, only the LL and LL128 protocols are supported for those events.
We recommend using NPKit for MSCCL instead of NPKit for NCCL, because the latter is no longer actively maintained. MSCCL and its NPKit integration are being actively developed, and MSCCL is functionally a superset of NCCL.
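The protocol restriction above can be enforced before launching a run. The sketch below builds the environment for a profiling run and rejects protocols other than LL and LL128. `NCCL_PROTO` and `NPKIT_DUMP_DIR` are the variables named in this thread; the helper `npkit_env` is hypothetical, and the set of supported protocols is taken only from the comment above for this NPKit version.

```python
import os

# Per the comment above: GPU send/recv events in this NPKit version
# are only supported with these protocols.
SUPPORTED_PROTOS = {"LL", "LL128"}


def npkit_env(proto, dump_dir, base_env=None):
    """Hypothetical helper: return an environment dict for a profiling run."""
    if proto not in SUPPORTED_PROTOS:
        raise ValueError(f"NCCL_PROTO={proto} is not supported for these events")
    # Copy the base environment so the caller's mapping is not modified.
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_PROTO"] = proto
    env["NPKIT_DUMP_DIR"] = dump_dir
    return env
```

The returned dict can be passed as `env=` to `subprocess.run` together with whatever benchmark command is being profiled.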
Thanks a lot! Can I profile the trace files of the original NCCL implementation in the MSCCL framework?
It's better to process the trace files using their corresponding scripts.
I have changed the NCCL_PROTO from "Simple" to "LL". The output trace of NET events is normal, but there are some errors when generating GPU events.