
Comments (10)

jrhemstad avatar jrhemstad commented on September 25, 2024

If I remember correctly, @jcohen-nvidia told me that nvtxRangeEnd can automatically infer the domain.

Looking at the header, it says something to the same effect:

* \remarks This function is offered for completeness but is an alias for ::nvtxRangeEnd.
* It does not need a domain param since that is associated with the range ID at ::nvtxDomainRangeStartEx

That said, I have no issue with adding the domain to end_range for completeness.

> Also, I am wondering whether it is really necessary to have the free-standing start_range and end_range functions. The domain_process_range class should be sufficient to handle all possible use cases and has the added benefit of preventing end users from misusing the API (e.g. generating an unterminated range, passing a wrong handle to end_range, etc.)

I 100% agree, but I think I was overruled. IIRC, in the DL frameworks NVTX stuff they needed the explicit start_range/end_range because they could only pass around a primitive type like an int in Python and couldn't move around the domain_process_range object.
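To make the trade-off concrete, here is a minimal sketch of the two styles side by side. Everything in the `mock` namespace is a hypothetical stand-in written for illustration, not the NVTX API: the free functions hand back a plain integer that could cross a language boundary, while the RAII class guarantees the matching end call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a profiling backend; not the NVTX API.
namespace mock {
using range_handle = std::uint64_t;  // primitive type, easy to pass around

inline std::set<range_handle>& open_ranges() {
    static std::set<range_handle> ranges;
    return ranges;
}

// Free functions: the caller is responsible for the matching end call.
inline range_handle start_range() {
    static range_handle next = 1;
    const range_handle h = next++;
    open_ranges().insert(h);
    return h;
}
inline void end_range(range_handle h) { open_ranges().erase(h); }

// RAII wrapper: the destructor always issues the matching end call.
class process_range {
public:
    process_range() : handle_(start_range()) {}
    ~process_range() { end_range(handle_); }
    process_range(const process_range&) = delete;
    process_range& operator=(const process_range&) = delete;

private:
    range_handle handle_;
};
}  // namespace mock

// Uses each style once and returns how many ranges are still open.
inline std::size_t demo_open_after_use() {
    {
        mock::process_range r;  // ended automatically at scope exit
    }
    const mock::range_handle h = mock::start_range();
    // ... h could be stored as a plain integer by a binding layer ...
    mock::end_range(h);  // forgetting this call would leave the range open
    return mock::open_ranges().size();
}
```

With the free functions, nothing stops a caller from dropping `h` or passing an unrelated value to `end_range`, which is the misuse the RAII class rules out.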

from nvtx.

AntoineFroger avatar AntoineFroger commented on September 25, 2024

> If I remember correctly, @jcohen-nvidia told me that nvtxRangeEnd can automatically infer the domain.

The implementation could indeed cache the domain handle in the range handle, if the latter is the address of a structure. But there is no guarantee that this is the case: the implementation could just as well return an incrementing ID as the range handle, without caching any context about which domain the range was started in. For instance, this is what Nsight Systems currently does.
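A rough sketch of the two handle strategies, using made-up names in a `mock` namespace (this is not how any particular tool is implemented). With a pointer-style handle the domain can be recovered from the handle alone; with an incrementing ID it cannot, so the end call has to be told the domain.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical tool-side bookkeeping; names are illustrative only.
namespace mock {
using domain_id = int;
using range_handle = std::uint64_t;

// Strategy 1: the handle is the address of a record that remembers its domain.
struct range_record { domain_id domain; };

inline range_handle start_with_pointer(domain_id d) {
    return reinterpret_cast<std::uintptr_t>(new range_record{d});
}
// The domain is inferred from the handle; no domain argument is needed.
inline domain_id end_with_pointer(range_handle h) {
    auto* rec = reinterpret_cast<range_record*>(static_cast<std::uintptr_t>(h));
    const domain_id d = rec->domain;
    delete rec;
    return d;
}

// Strategy 2: the handle is just an incrementing ID with no embedded context.
inline std::map<domain_id, std::set<range_handle>>& open_by_domain() {
    static std::map<domain_id, std::set<range_handle>> open;
    return open;
}
inline range_handle start_with_counter(domain_id d) {
    static range_handle next = 0;
    const range_handle h = ++next;
    open_by_domain()[d].insert(h);
    return h;
}
// Returns true only if h is an open range in domain d.
inline bool end_with_counter(domain_id d, range_handle h) {
    return open_by_domain()[d].erase(h) == 1;
}
}  // namespace mock
```

Under the second strategy, ending a range without the right domain simply fails to find it, which is why a domain-less end call cannot be relied on in general.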

> I 100% agree, but I think I was overruled. IIRC, in the DL frameworks NVTX stuff they needed the explicit start_range/end_range because they could only pass around a primitive type like an int in Python and couldn't move around the domain_process_range object.

Makes sense, thanks for the explanation!


jcohen-nvidia avatar jcohen-nvidia commented on September 25, 2024

I think I was assuming that handles returned from nvtxDomainRangeStart should be pointers, and a tool could handle getting a ...RangeEnd call on that thing properly, regardless of whether the correct (or any) domain was passed to the RangeEnd call. But I realize now this is dumb, because we are moving forward with domain filtering, and it's much easier for tools if we require that an nvtxDomainRangeStart handle in domain D be always used with nvtxDomainRangeEnd in D as well. That lets tools avoid the indirection to get the domain. Also, this would open up the door for tools to have range handles be incrementing integers per domain. So I'm happy to overrule my old overruling. :)


jcohen-nvidia avatar jcohen-nvidia commented on September 25, 2024

Also, I see no reason to leave perf on the table by redoing the ::get on the domain in the destructor. We might as well store the domain handle in the class during construction, so the destructor can use it for free.
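Something along these lines, as a sketch only. The `mock` types, `my_domain::get`, and the `range_start`/`range_end` helpers below are hypothetical stand-ins, with a counter added so the number of lookups is observable.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the real NVTX types or functions.
namespace mock {
struct domain_handle { int id; };

inline int& lookup_count() {
    static int n = 0;
    return n;
}

// Stand-in for a domain type whose ::get performs a (possibly costly) lookup.
struct my_domain {
    static domain_handle* get() {
        static domain_handle h{42};
        ++lookup_count();
        return &h;
    }
};

inline unsigned long long range_start(domain_handle*) {
    static unsigned long long next = 0;
    return ++next;
}
inline void range_end(domain_handle*, unsigned long long) {}

template <class D>
class cached_range {
public:
    // Look the domain up once and keep the handle for the destructor.
    cached_range() : domain_(D::get()), handle_(range_start(domain_)) {}
    ~cached_range() { range_end(domain_, handle_); }  // no second D::get()
    cached_range(const cached_range&) = delete;
    cached_range& operator=(const cached_range&) = delete;

private:
    domain_handle* domain_;      // declared first, so initialized first
    unsigned long long handle_;
};
}  // namespace mock

// Returns how many domain lookups one full range lifetime costs.
inline int demo_lookups() {
    const int before = mock::lookup_count();
    {
        mock::cached_range<mock::my_domain> r;
    }
    return mock::lookup_count() - before;
}
```

The cost is one extra pointer per range object, in exchange for a single lookup per range instead of two.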


jcohen-nvidia avatar jcohen-nvidia commented on September 25, 2024

And as for having free-standing functions to do start_range and end_range, yeah, originally we did not have these. Jon Dekhtiar added them because they wanted to be able to make the calls at arbitrary points in their Python code, without having to create or destroy an object to make it happen. And all these garbage-collected languages aren't good about RAII for perf measurements because they don't like to destroy objects when you ask them to. So having a Python object to model a domain_process_range is more awkward for them than free functions, where it's their responsibility to make sure the end call happens. In general that's just an issue with process_range vs. thread_range... If you're going to pass the thing around, you have to make sure it gets destroyed at exactly the point where you want that timeline range to end.


jcohen-nvidia avatar jcohen-nvidia commented on September 25, 2024

And to Jake's comments about renaming domain_process_range to unique_range, I think the challenge is to make sure that users know where to expect these things to appear in tools. We used to always show push/pop range rows as children of thread rows, and start/end range rows as children of process rows. But that's gotten less popular since Nsys now shows Start/End ranges under a thread if it starts or ends on that thread. This is something people have asked for -- even though it's less simple to explain, it tends to make users happier. So yeah, the whole "thread range" and "process range" nomenclature deserves to be reconsidered now that the hard process vs. thread distinction is not how the tool works anymore.

I do like unique_range for something that's literally a wrapper around unique_ptr. Probably still want the domain vs. no-domain versions. I do kind of prefer the _in suffix over the domain_ prefix, because I think it just looks nicer to see e.g. unique_range_in over domain_unique_range. What do you guys think?

Also, if we were to change thread_range as well, I think the name I like most is "stack_range". I thought about "scope_range" or "scoped_range" as well, but I lean toward "stack" because it directly evokes thoughts of "push" and "pop", and also, we implemented it to err out on heap usage so that it can only be used on the stack. If you guys think this renaming makes sense, I'd be for it.
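For readers unfamiliar with the "err out on heap usage" trick, one common way to get it is to delete the class's `operator new`. The sketch below assumes that approach and uses a hypothetical `mock::stack_range` that records push/pop events in a vector instead of calling a profiler.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in that logs events instead of calling a profiler.
namespace mock {
inline std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

class stack_range {
public:
    explicit stack_range(const char* name) {
        events().push_back(std::string("push ") + name);
    }
    ~stack_range() { events().push_back("pop"); }

    stack_range(const stack_range&) = delete;
    stack_range& operator=(const stack_range&) = delete;

    // Deleting these makes `new stack_range("x")` a compile-time error,
    // so objects can only live in automatic (or static) storage.
    static void* operator new(std::size_t) = delete;
    static void* operator new[](std::size_t) = delete;
};
}  // namespace mock

// Nested scopes produce strictly nested push/pop events.
inline std::vector<std::string> demo_nested() {
    mock::events().clear();
    {
        mock::stack_range outer("outer");
        {
            mock::stack_range inner("inner");
        }
    }
    return mock::events();
}
```

Because destruction order follows scope nesting, the pops always mirror the pushes, which is the property that makes a push/pop style range safe to wrap this way.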


jrhemstad avatar jrhemstad commented on September 25, 2024

> I think the name I like most is "stack_range". I thought about "scope_range" or "scoped_range" as well, but I lean toward "stack" because it directly evokes thoughts of "push" and "pop", and also, we implemented it to err out on heap usage so that it can only be used on the stack. If you guys think this renaming makes sense, I'd be for it.

I actually like scoped_range better. Going off of my analogy of unique_range being like a unique_lock, scoped_range is evocative of std::scoped_lock.


jcohen-nvidia avatar jcohen-nvidia commented on September 25, 2024

Oh, and one more thing I just realized, since naming is of course the hardest part of programming... If STL is potentially going to add a std::unique_resource class, then maaaaaybe we shouldn't also add an nvtx3::unique_resource. But I was thinking that if we have a unique_range for RAIIing start/end ranges, I would also like a RAII wrapper at some point for the NVTX generic resource create/destroy functions. So I immediately thought "unique_resource", and then realized why that name sounded nice, haha. Maybe nvtx3::unique_named_resource, but that's getting long. So perhaps just nvtx3::resource, and explain it's like a unique_ thing?
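As a rough idea of what such a wrapper could look like, here is a sketch built on `std::unique_ptr` with a custom deleter. The `mock::resource_create` and `mock::resource_destroy` functions are invented placeholders for a create/destroy pair and do not match the NVTX resource API; the alias name is just one of the candidates floated above.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Invented placeholders for a create/destroy pair; not the NVTX resource API.
namespace mock {
struct resource { int id; };

inline int& live_count() {
    static int n = 0;
    return n;
}
inline resource* resource_create(int id) {
    ++live_count();
    return new resource{id};
}
inline void resource_destroy(resource* r) {
    --live_count();
    delete r;
}

// The deleter routes unique_ptr's cleanup to the destroy function.
struct resource_deleter {
    void operator()(resource* r) const noexcept { resource_destroy(r); }
};

using unique_named_resource = std::unique_ptr<resource, resource_deleter>;

inline unique_named_resource make_resource(int id) {
    return unique_named_resource(resource_create(id));
}
}  // namespace mock

// Moving transfers ownership; exactly one resource is live while b exists.
inline int demo_live_after_move() {
    mock::unique_named_resource a = mock::make_resource(1);
    mock::unique_named_resource b = std::move(a);
    return mock::live_count();
}
```

The move-only semantics come for free from `unique_ptr`, which is the main appeal of the "unique_" naming.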


jcohen-nvidia avatar jcohen-nvidia commented on September 25, 2024

If you guys like "scoped_range" better than "stack_range" I am fine with that too. I did think of scoped_lock/unique_lock. But ours don't work exactly the same as those, so not sure if that's good or not.


AntoineFroger avatar AntoineFroger commented on September 25, 2024

Closing. The discussion about renaming should be continued in #40.

