Comments (10)
If I remember correctly, @jcohen-nvidia told me that nvtxRangeEnd can automatically infer the domain.
Looking at the header, it says something to the same effect:
NVTX/c/include/nvtx3/nvToolsExt.h
Lines 720 to 721 in c09c47a
That said, I have no issue with adding the domain to end_range for completeness.
Also, I am wondering whether it is really necessary to have the free-standing start_range and end_range functions. The domain_process_range class should be sufficient to handle all possible use cases and has the added benefit of preventing the end-users from misusing the API (e.g. generating an unterminated range, passing a wrong handle to end_range, etc.)
I 100% agree, but I think I was overruled. IIRC, in the DL frameworks NVTX stuff they needed the explicit start_range/end_range because they could only pass around a primitive type like an int in Python and couldn't move around the domain_process_range object.
from nvtx.
> If I remember correctly, @jcohen-nvidia told me that nvtxRangeEnd can automatically infer the domain.
The implementation could effectively cache the domain handle in the range handle, if the latter points to the address of a structure. But there is no guarantee that this is true, and the implementation could very well just return an incremental ID as the range handle without caching any context about which domain the range was started in. For instance, this is what Nsight Systems currently does.
> I 100% agree, but I think I was overruled. IIRC, in the DL frameworks NVTX stuff they needed the explicit start_range/end_range because they could only pass around a primitive type like an int in Python and couldn't move around the domain_process_range object.
Makes sense, thanks for the explanation!
I think I was assuming that handles returned from nvtxDomainRangeStart should be pointers, and a tool could handle getting a ...RangeEnd call on that thing properly, regardless of whether the correct (or any) domain was passed to the RangeEnd call. But I realize now this is dumb, because we are moving forward with domain filtering, and it's much easier for tools if we require that an nvtxDomainRangeStart handle in domain D be always used with nvtxDomainRangeEnd in D as well. That lets tools avoid the indirection to get the domain. Also, this would open up the door for tools to have range handles be incrementing integers per domain. So I'm happy to overrule my old overruling. :)
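To make the handle argument above concrete, here is a minimal sketch of a tool that hands out incrementing integers per domain. `mock_tool`, `domain_id`, and `range_id` are hypothetical names invented for illustration; they are not NVTX or Nsight Systems APIs, and real tools may track ranges differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <utility>

// Hypothetical stand-ins for illustration only; not the NVTX API.
using domain_id = int;
using range_id = std::uint64_t;

class mock_tool {
public:
    // Each domain has its own counter, so handles are unique only per domain.
    range_id range_start(domain_id d) {
        range_id id = ++next_id_[d];
        open_.insert({d, id});
        return id;
    }

    // The domain is required: the integer alone does not identify the range.
    // Returns false if no such range is open in that domain.
    bool range_end(domain_id d, range_id id) {
        return open_.erase({d, id}) == 1;
    }

    std::size_t open_count() const { return open_.size(); }

private:
    std::map<domain_id, range_id> next_id_;
    std::set<std::pair<domain_id, range_id>> open_;
};

// Usage: two domains can hand out the same integer handle.
inline void demo_per_domain_handles() {
    mock_tool tool;
    range_id a = tool.range_start(1);
    range_id b = tool.range_start(2);
    assert(a == b);                // handles collide across domains
    assert(tool.range_end(1, a));  // ends the range in domain 1
    assert(tool.range_end(2, b));  // ends the range in domain 2
}
```

With this scheme the tool never has to dereference a handle to find the domain, but a caller that passes the wrong domain would end the wrong range or none at all.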
Also, I think there's no reason to leave perf on the table by redoing the ::get on the domain. We might as well store the domain handle in the class in the constructor, so the destructor can use it for free.
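A rough sketch of that idea, assuming a lookup function that has some cost per call. Everything here (`fake_domain`, `get_domain`, `fake_range_start`, `fake_range_end`, `cached_domain_range`) is a hypothetical stand-in, not the actual nvtx3 class or the NVTX C functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the domain and the C-level calls.
struct fake_domain { int starts = 0; int ends = 0; };
using fake_range_id = std::uint64_t;

// Counts how often the (assumed non-free) domain lookup runs.
inline int& lookup_count() { static int n = 0; return n; }

inline fake_domain* get_domain() {
    static fake_domain d;
    ++lookup_count();
    return &d;
}

inline fake_range_id fake_range_start(fake_domain* d) {
    return static_cast<fake_range_id>(++d->starts);
}

inline void fake_range_end(fake_domain* d, fake_range_id) { ++d->ends; }

class cached_domain_range {
public:
    // Look the domain up once and keep the handle for the destructor.
    cached_domain_range()
        : domain_(get_domain()), id_(fake_range_start(domain_)) {}

    // No second lookup here: the cached handle is reused.
    ~cached_domain_range() { fake_range_end(domain_, id_); }

    cached_domain_range(const cached_domain_range&) = delete;
    cached_domain_range& operator=(const cached_domain_range&) = delete;

private:
    fake_domain* domain_;  // declared before id_ so it is initialized first
    fake_range_id id_;
};
```

The cost is one extra pointer per range object, in exchange for a single lookup per range instead of two.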
And as for having free-standing functions to do start_range and end_range, yeah, originally we did not have these. Jon Dekhtiar added them because they wanted to be able to make the calls at arbitrary points in their Python code, without having to create or destroy an object to make it happen. And all these garbage-collected languages aren't good about RAII for perf measurements because they don't like to destroy objects when you ask them to. So having a Python object to model a domain_process_range is more awkward for them than free functions, where it's their responsibility to make sure the end call happens. In general that's just an issue with process_range vs. thread_range... If you're going to pass the thing around, you have to make sure it gets destroyed at exactly the point where you want that timeline range to end.
And to Jake's comments about renaming domain_process_range to unique_range, I think the challenge is to make sure that users know where to expect these things to appear in tools. We used to always show push/pop range rows as children of thread rows, and start/end range rows as children of process rows. But that's gotten less popular since Nsys now shows Start/End ranges under a thread if it starts or ends on that thread. This is something people have asked for -- even though it's less simple to explain, it tends to make users happier. So yeah, the whole "thread range" and "process range" nomenclature deserves to be reconsidered now that the hard process vs. thread distinction is not how the tool works anymore. I do like unique_range for something that's literally a wrapper around unique_ptr. Probably still want the domain vs. no-domain versions. I do kind of prefer the _in suffix over the domain_ prefix, because I think it just looks nicer to see e.g. unique_range_in over domain_unique_range. What do you guys think? Also, if we were to change thread_range as well, I think the name I like most is "stack_range". I thought about "scope_range" or "scoped_range" as well, but I lean toward "stack" because it directly evokes thoughts of "push" and "pop", and also, we implemented it to err out on heap usage so that it can only be used on the stack. If you guys think this renaming makes sense, I'd be for it.
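For readers wondering what "err out on heap usage" can look like, one common C++ technique is to delete the class's `operator new`. The comment above does not say how nvtx3 does it, so treat the following as an assumption-laden sketch; `stack_only_range` and `range_stack` are hypothetical names, and the vector stands in for the tool's push/pop state.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-thread push/pop stack standing in for the tool's state.
inline std::vector<std::string>& range_stack() {
    thread_local std::vector<std::string> s;
    return s;
}

class stack_only_range {
public:
    // Constructor pushes, destructor pops, so nesting follows C++ scopes.
    explicit stack_only_range(const char* name) { range_stack().push_back(name); }
    ~stack_only_range() { range_stack().pop_back(); }

    stack_only_range(const stack_only_range&) = delete;
    stack_only_range& operator=(const stack_only_range&) = delete;

    // With these deleted, `new stack_only_range("x")` fails to compile.
    static void* operator new(std::size_t) = delete;
    static void* operator new[](std::size_t) = delete;
};
```

This blocks direct `new` expressions only. It does not stop the type from being a member of some other heap-allocated object, so it is a guard rail rather than a guarantee.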
> I think the name I like most is "stack_range". I thought about "scope_range" or "scoped_range" as well, but I lean toward "stack" because it directly evokes thoughts of "push" and "pop", and also, we implemented it to err out on heap usage so that it can only be used on the stack. If you guys think this renaming makes sense, I'd be for it.
I actually like scoped_range better. Going off of my analogy of unique_range being like a unique_lock, scoped_range is evocative of std::scoped_lock.
Oh, and one more thing I just realized, since naming is of course the hardest part of programming... If STL is potentially going to add a std::unique_resource class, then maaaaaybe we shouldn't also add an nvtx3::unique_resource. But I was thinking that if we have a unique_range for RAIIing start/end ranges, I would also like a RAII wrapper at some point for the NVTX generic resource create/destroy functions. So I immediately thought "unique_resource", and then realized why that name sounded nice, haha. Maybe nvtx3::unique_named_resource, but that's getting long. So perhaps just nvtx3::resource, and explain it's like a unique_ thing?
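As a sketch of what such a RAII wrapper around a create/destroy pair might look like: a move-only owner that destroys the handle exactly once. The names `fake_resource_create`, `fake_resource_destroy`, and `resource_sketch` are hypothetical placeholders and not the NVTX resource functions or a proposed nvtx3 type.

```cpp
#include <cassert>
#include <utility>

// Hypothetical create/destroy pair; a handle of 0 means "no resource".
using fake_resource_handle = int;

inline int& live_resources() { static int n = 0; return n; }

inline fake_resource_handle fake_resource_create() {
    static int next = 0;
    ++live_resources();
    return ++next;
}

inline void fake_resource_destroy(fake_resource_handle) { --live_resources(); }

class resource_sketch {
public:
    resource_sketch() : h_(fake_resource_create()) {}
    ~resource_sketch() { reset(); }

    // Moving transfers ownership and leaves the source empty.
    resource_sketch(resource_sketch&& o) noexcept : h_(std::exchange(o.h_, 0)) {}
    resource_sketch& operator=(resource_sketch&& o) noexcept {
        if (this != &o) {
            reset();
            h_ = std::exchange(o.h_, 0);
        }
        return *this;
    }

    resource_sketch(const resource_sketch&) = delete;
    resource_sketch& operator=(const resource_sketch&) = delete;

    bool owns() const { return h_ != 0; }

private:
    // Destroys the handle if one is held, then marks this object empty.
    void reset() {
        if (h_ != 0) {
            fake_resource_destroy(h_);
            h_ = 0;
        }
    }

    fake_resource_handle h_;
};
```

The ownership semantics mirror `std::unique_ptr`, which is why a `unique_`-style name comes to mind for it.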
If you guys like "scoped_range" better than "stack_range" I am fine with that too. I did think of scoped_lock/unique_lock. But ours don't work exactly the same as those, so not sure if that's good or not.
Closing. The discussion about renaming should be continued in #40.