
Backend strategy (torchquad issue, 14 comments, closed)

jonas-eschle commented on July 23, 2024
Backend strategy


Comments (14)

FHof commented on July 23, 2024

Yes, when I rewrote the VEGAS code, it was helpful for me to separate the changes into multiple commits as follows. I first replaced torch functions with those from autoray; these changes were often simple but touched many lines of code (23258a4). Then I changed the code so that it works with both numpy and torch (c0416f0). After that, I added details such as support for dtypes.
This made it easier to find bugs: After the first step, VEGAS executed the same torch operations (just wrapped by autoray), whereas after the second step, it used different operations, for example reshape instead of unsqueeze. Therefore bugs introduced in the first step would be caused by a misuse of autoray (e.g. forgotten like arguments), whereas bugs in the second step would be caused by a misunderstanding of differences between backends (e.g. differences between backend-agnostic type conversion and torch's .int() and .long() tensor methods).
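For illustration, a minimal sketch of those two steps (the call sites here are made up, not the actual torchquad diff; autoray's like argument and astype are the real API):

# Step 1: route array calls through autoray while still producing torch tensors.
# Forgetting the like= argument makes autoray fall back to numpy for creation
# functions, which is the kind of bug mentioned above.
from autoray import numpy as anp

x = anp.linspace(0.0, 1.0, 5, like="torch")

# Step 2: use only operations that exist in both numpy and torch,
# e.g. reshape instead of torch's unsqueeze ...
x_col = anp.reshape(x, (-1, 1))   # instead of x.unsqueeze(1)

# ... and backend-agnostic dtype conversion instead of torch's .int()/.long().
from autoray import astype
idx = astype(x_col, "int64")      # instead of x_col.long()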


gomezzz commented on July 23, 2024

Hi @jonas-eschle !

Thank you for your kind words, glad you like the project! :)

My question is about backends. At least so far it seems that only a small subset of the torch methods is used. Since there are other libraries, notably JAX and TensorFlow, which both have a numpy-like API just like PyTorch, it seems nearly trivial to me to support these backends as well, at least at the current stage of the project.

True. This would definitely be possible.

I think the limitations of all frameworks are quite similar: they are great at vectorized operations and bad at adaptive/sequential methods. I am also aware, though, that this can potentially make more sophisticated integration methods more difficult to implement, such as when using control flow.

Also true. Not sure about the performance vegasflow achieves, but VEGAS in torchquad has proven tricky to optimize. Its scaling can never compete with more naturally parallel methods such as vanilla Monte Carlo or Newton-Cotes rules. Maybe there are better Monte Carlo solvers in terms of achieving good scaling 🤔. Currently, there are no concrete plans for implementing other complex integrators, although it would definitely be interesting and worthwhile (see e.g. #127).

I'm not entirely sure if supporting TensorFlow is ideal, as there are other modules for it already, and having to keep up with several changing APIs does of course increase complexity. I am not aware of any implementations for JAX though, which may be interesting.

In general, I think it could be feasible to do something like in this project I recently came across:

This would also allow porting only some of the integrators to other frameworks (e.g. porting VEGAS may be more complex and, for TF, not really worth it given that vegasflow exists?).

Is this somewhat what you had in mind? :)


jonas-eschle commented on July 23, 2024

Yes, exactly, this is about what I had in mind. And yes, the more complicated the method, the harder, or even impossible, it gets to implement (in a jittable and autograd-compatible way).

We actually also tried to get some more dynamic methods in here, with mixed success: https://github.com/M-AlMinawi/Integration-of-single-variable-functions-using-TensorFlow

I think the backend class or similar should do the trick indeed!

For VEGAS, at least the VEGAS+ variant that vegasflow implements now seems considerably better than plain MC, but I am not too familiar with the internals there.


gomezzz commented on July 23, 2024

Would it be something you would be interested in implementing? I think, ideally, one could approach this first for one of the Newton-Cotes (or plain MC) methods to see how much it breaks the codebase and how complex it would be for VEGAS+.

For VEGAS, at least the VEGAS+ variant that vegasflow implements now seems considerably better than plain MC, but I am not too familiar with the internals there.

In terms of convergence, I think, it really depends on the integrand. It needs to be sufficiently volatile to profit from the adaptiveness. In terms of runtime it can't scale as well as plain MC, I think.

I haven't had the opportunity to compare with the vegasflow implementation in terms of speed, but at the moment the VEGAS+ in torchquad is actually faster on CPU. At least our implementation seems not to be parallel enough yet to profit from GPUs, I guess.

What kind of dimensionality do the problems you investigate have?


jonas-eschle commented on July 23, 2024

I would surely be interested in helping, e.g. also with the backend (choices), but I can't commit a lot of time to it at the moment.

In terms of convergence, I think, it really depends on the integrand. It needs to be sufficiently volatile to profit from the adaptiveness. In terms of runtime it can't scale as well as plain MC, I think.

Yes, we did some studies on this and it's sometimes useful and sometimes not, simply speaking

I haven't had the opportunity to compare with the vegasflow implementation in terms of speed, but at the moment the VEGAS+ in torchquad is actually faster on CPU. At least our implementation seems not to be parallel enough yet to profit from GPUs, I guess.

I wonder as well, indeed, but we're currently investigating this anyway; I can let you know once we have some results on it.

The dimensionality goes from 1-2D problems usually up to 5-6 (so quite low dimensional still, but already tricky enough to integrate).


gomezzz commented on July 23, 2024

I would surely be interested in helping, e.g. also with the backend (choices), but I can't commit a lot of time to it at the moment.

Unfortunately, I am also rather busy at the moment. But we can start with a little requirements engineering to see if it isn't fairly quick to do (see below).

I wonder as well, indeed, but we're currently investigating this anyway; I can let you know once we have some results on it.

Sure! I'm curious.

The dimensionality goes from 1-2D problems usually up to 5-6 (so quite low dimensional still, but already tricky enough to integrate).

But currently you are using VEGAS, as far as I understand? I'm wondering if the deterministic methods aren't still competitive here given the better scaling? But I think you tried that in the thesis you linked before?

From your description I gather you would be most interested in starting with a TF backend?

Needed changes would be

  1. Implement a backend class similar to this in torchquad/utils/backend.py or torchquad/backend/ if it ends up being multiple files/classes. It has to provide access to all methods in TF and torch that are used in the integration method selected below. Additionally, one ought to check that the API is the same (e.g. torch.transpose and np.transpose behave differently :S not sure about TF). This might warrant implementing a separate test for all functions in the backend to ensure consistency.
  2. Adapt one of the Newton-Cotes methods (e.g. trapezoid.py or simpson.py) to use the backend. This will likely also require integrating the backend into integration_grid.py.
  3. Add a way for the user to choose the backend. As a start, during the creation of the integrator (so the constructor of the integration method selected above) one can just pass an extra variable for it. This also makes it clear which integrators support this. In the integrator, one can have something like
import tensorflow_backend as tf_backend

(...)

if selected_backend == "tensorflow":
    self.backend = tf_backend
(...)
  4. Update environment.yml to include the framework (TF or such)
  5. Add some tests for it in the respective integrator test (e.g. `tests/trapezoid_test.py`)

So it should not be too much work. Personally, I suspect the most annoying part is ensuring the backends match for all the functions.
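To illustrate the transpose mismatch from point 1 and the kind of consistency test I mean (the backend wrapper at the end is hypothetical, just to show the idea):

import numpy as np
import torch

a = np.arange(6).reshape(2, 3)
t = torch.arange(6).reshape(2, 3)

# np.transpose with no axes argument reverses all dimensions, while
# torch.transpose always needs the two dimensions to swap, so a naive
# 1:1 mapping between the APIs would break.
assert np.transpose(a).shape == (3, 2)
assert torch.transpose(t, 0, 1).shape == (3, 2)
# torch.transpose(t) raises a TypeError because dim0 and dim1 are required.

# A consistency test for a (hypothetical) backend wrapper could then compare
# every wrapped function against the numpy result, e.g.
# assert np.allclose(backend_torch.transpose(t), np.transpose(a))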


jonas-eschle commented on July 23, 2024

But currently you are using VEGAS, as far as I understand? I'm wondering if the deterministic methods aren't still competitive here given the better scaling? But I think you tried that in the thesis you linked before?

Not yet, we're basically just trying it out. And it seems to be quite a bit better than QMC methods.

From your description I gather you would be most interested in starting with a TF backend?

Ish, TF now has a numpy-like backend, tensorflow.experimental.numpy, that could be used (modulo control flow, which needs to be wrapped somehow). So I would suggest using that.
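Just to illustrate (assuming a reasonably recent TF version):

import tensorflow.experimental.numpy as tnp

tnp.experimental_enable_numpy_behavior()    # opt into numpy-style dtypes and indexing

x = tnp.linspace(0.0, 1.0, 5)               # numpy-style call, backed by TF tensors
y = tnp.sum(tnp.reshape(x, (-1, 1)) ** 2)   # still differentiable/jittable like normal TF ops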

Or maybe even better, something like autoray, which already wraps the low-level numpy API of multiple libraries (not the gradients or control flow yet), but it could already help a lot.
Btw don't be fooled, they have a full numpy API, they just don't really advertise it.

Needed changes would be

  1. Implement a backend class similar to this in torchquad/utils/backend.py or torchquad/backend/ if it ends up being multiple files/classes. It has to provide access to all methods in TF and torch that are used in the integration method selected below. Additionally, one ought to check that the API is the same (e.g. torch.transpose and np.transpose behave differently :S not sure about TF). This might warrant implementing a separate test for all functions in the backend to ensure consistency.

Yes, maybe this is already handled by autoray or something similar.

  2. Adapt one of the Newton-Cotes methods (e.g. trapezoid.py or simpson.py) to use the backend. This will likely also require integrating the backend into integration_grid.py.

Yes, this needs to be done.

  3. Add a way for the user to choose the backend. As a start, during the creation of the integrator (so the constructor of the integration method selected above) one can just pass an extra variable for it. This also makes it clear which integrators support this. In the integrator, one can have something like

Something like this, yes. But I would put that at second-order priority; I guess the main issue is actually to get it working.

  4. Update environment.yml to include the framework (TF or such)
  5. Add some tests for it in the respective integrator test (e.g. `tests/trapezoid_test.py`)

So it should not be too much work. Personally, I suspect the most annoying part is ensuring the backends match for all the functions.

Yes, I agree. And since this work can be taken over by autoray, that may help a lot.


gomezzz commented on July 23, 2024

Or maybe even better, something like autoray, which already wraps the low-level numpy API of multiple libraries (not the gradients or control flow yet), but it could already help a lot.
Btw don't be fooled, they have a full numpy API, they just don't really advertise it.

Interesting! Thanks for pointing out autoray, that seems like a very exciting project that might help a lot with this. :)

I don't have time next week, but I think I will take a day to play around with this some time after. Or, if you want to try it out in torchquad, feel free to do so. For now, we can create a branch for this and start experimenting to see how complex it is in the end. I'll let you know when I find time. Feel free to mention it if you get to it sooner than me!

But overall, autoray makes me fairly confident that this should be quite doable.


jonas-eschle commented on July 23, 2024

Good, then we share the same view on this! I am also going to play around with it in zfit to get a feel for it, so yes, let's just start with an experimental branch and see how it goes.


gomezzz commented on July 23, 2024

@jonas-eschle There is now a master's student from TUM working on this as his thesis! He will be creating a branch in torchquad for this and we hope to have something running some time soon. Then we can run some performance and usability trials and see where it takes us!

How did it go with zfit?


jonas-eschle commented on July 23, 2024

Hey, sorry I missed this post! We started as well, but some other design priorities came up to understand our general API better. But I think we will give this a try around January; I am also currently looking for a student to do that.

How is it going so far?


gomezzz commented on July 23, 2024

No worries. :) Glad to hear you are still on it as well!

We are actually looking into a total conversion now as it is really going quite well. It's happening here: https://github.com/FHof/torchquad/

It's already quite functional, and we are now digging deeper into performance comparisons among the frameworks to see which parts are bottlenecks on which frameworks, comparing things like XLA, JIT, etc. Progressing quite well!


gomezzz commented on July 23, 2024

@FHof has fully integrated TF, numpy and JAX support in #137 :) Just merged it and will create a separate release for it soon.
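From the user side it should look roughly like this (simplified sketch; the exact function and argument names may still shift a bit, so check the docs once the release is out):

# Rough sketch only; see the torchquad docs/release notes for the exact API.
from torchquad import MonteCarlo, set_up_backend

set_up_backend("jax", data_type="float32")  # or "tensorflow", "numpy", "torch"

def f(x):
    # x has shape (N, dim) as an array of the selected backend
    return (x ** 2).sum(axis=1)

mc = MonteCarlo()
result = mc.integrate(f, dim=2, N=10000,
                      integration_domain=[[0.0, 1.0], [0.0, 1.0]])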

@FHof Any overall thoughts, in your opinion, that one ought to keep in mind when attempting an autoray integration like this?


gomezzz commented on July 23, 2024

Thanks!

