Coder Social home page Coder Social logo

Comments (10)

bob-carpenter avatar bob-carpenter commented on July 4, 2024

What do you think of building such functionality into BridgeStan?

I couldn't quite follow all the callbacks, so I'm not entirely sure what you are suggesting adding to BridgeStan and what that would enable for clients that they can't achieve now.

from bridgestan.

WardBrian avatar WardBrian commented on July 4, 2024

My understanding of Edward's request is that a user may be working with other code that uses ctypes, and as a result have a double array which is represented as a ctypes.POINTER(ctypes.c_double), rather than a numpy array object.

There is nothing at the C level of BridgeStan to differentiate between these, and in principle either is fine, but for convenience BridgeStan only speaks the language of numpy ndarrays, not 'raw' ctypes pointers. You can convert from a ctypes pointer to an ndarray, but since all BridgeStan really does is some bounds checking and then convert it back to a pointer internally, this comes with an extra cost at no extra benefit.

The concrete code Ed provided would give you a different function which accepts ctypes objects as opposed to numpy ones, but I also have a minimal patch prepared which would make it so either can be accepted anywhere we currently accept numpy arrays, meaning no extra functions, just slightly broader types being accepted.

from bridgestan.

roualdes avatar roualdes commented on July 4, 2024

I couldn't quite follow all the callbacks

Ya, sorry about that. I'm struggling to explain this fairly complicated process. Thankfully, Brian is experienced at filling in the gaps between what Edward says and what Edward means.

Skipping the details, there is nothing clients can't achieve now. There is opportunity, as I see it, to enable calling BridgeStan both more simply and faster in a very specific use case.

from bridgestan.

bob-carpenter avatar bob-carpenter commented on July 4, 2024

Got it. Thanks!

from bridgestan.

WardBrian avatar WardBrian commented on July 4, 2024

I'm not sure if this is an issue outside of Python, since it really stems from numpy being ubiquitous but not actually in the standard library.

That said, the following snippet of code only uses our public API, so there is nothing actually requiring this to live inside BridgeStan to unlock the same functionality:

model = bridgestan.StanModel(...)
ctypes_log_density_gradient = model.stanlib["bs_log_density_gradient"]
# set up metadata as Edward does above, using ctypes.POINTER(ctypes.c_double)

The only thing you really lose here is the Python wrappings around the exceptions, you'd need to manually check the return codes etc. OTOH, if you're writing something that calls this from inside other compiled code, then that may actually be how you'd like things to be set up

from bridgestan.

roualdes avatar roualdes commented on July 4, 2024

Right, this doesn't seem to be an issue outside of Python, for the reasons you mention. Good call.

Brian proposed some code, in branch python/allow-ctypes-double-pointers, that allows direct access to BridgeStan's log_density_gradient() for both numpy arrays and ctypes. The solution involves an extra if statement, relative to what lives in main branch. To help us determine whether or not we want to include such functionality in BridgeStan, I ran some experiments and report the results below.

I ran some numerical simulations to better understand the computational cost trade-offs associated with this extra if statement. From a numpy array user perspective, we expect a slight slow down (one extra if statement). From a ctypes user perspective, we expect a major gain (no unnecessary numpy array creation). The code used for the simulations lives in the github repository bridgestan-speed. The simulations were across three models: gaussian with data (2 parameters), logistic regression (25 parameters), and standard normal with no data (1_000 parameters).

Numpy array perspective. 500 runs of 1_000 evaluations of log_density_gradient
across two branches. Times reported are means per run plus/minus standard
deviations.

dimensions / branch             main                    python/allow-ctypes-double-pointers
2 (normal w/ data)              13.9 ms +/- 40.9 μs     14.6 ms +/- 493 μs
25 (logistic regression)        96.2 ms +/- 8.42 ms     82.7 ms +/- 1.13 ms
1_000 (std_normal no data)      27.5 ms +/- 1.48 ms     27.1 ms +/- 721 μs

Ctypes perspective. 20 runs of Stan sampling with 1_000 warmup and 1_000
iterations across two branches. Times reported are means per Stan run
plus/minus standard deviations. RNG seeds were coordinated to ensure no branch
lucked into a worse part of the model parameter space than the other branch.


dimensions / branch             main                    python/allow-ctypes-double-pointers
2 (normal w/ data)              251 ms +/- 7.36 ms      89.9 ms +/- 5.3 ms
25 (logistic regression)        2.56 s +/- 132 ms       2 s +/- 74.9 ms
1_000 (std_normal no data)      1.51 s +/- 30.3 ms      908 ms +/- 92 ms

What do you all think, should we add this functionality to BridgeStan?

from bridgestan.

bob-carpenter avatar bob-carpenter commented on July 4, 2024

Thanks for the report @roualdes. I couldn't quite tell what was being compared. I see two tables of results, but don't know what they mean. What does "across two branches" mean for the "Numpy array perspective"? Is it that the first one is passing in an numpy array and the second passing in some raw memory?

from bridgestan.

roualdes avatar roualdes commented on July 4, 2024

What does "across two branches" mean for the "Numpy array perspective"?

The first table is meant to show costs associated with the proposed solution to this issue for the most common Python/BridgeStan user, a user who is only dealing with numpy arrays. The table presents mean run times for calling BridgeStan's log_density_gradient with numpy arrays. The first column sets the baseline for run times in the branch main. The second column displays run times for the proposed solution from branch python/allow-ctypes-double-pointers.

Is it that the first one is passing in an numpy array and the second passing in some raw memory?

Correct. The second table is meant to show benefits associated with the proposed solution when a user has a double* to raw memory.

Let me know if I can offer any other words to help clarify what my results are about.

from bridgestan.

bob-carpenter avatar bob-carpenter commented on July 4, 2024

Thanks---that's what I suspected and this looks close enough to be worth doing.

If the results weren't clear enough here, I'd be tempted to do about 10 times as many evals over several batches because the inconsistent sd in the tests you list (e.g., "13.9 ms +/- 40.9 μs 14.6 ms +/- 493 μs" in the first line of the first table, where the new method has ten times the standard deviation). This kind of micro-benchmarking is notoriously tricky, so probably not worth it.

from bridgestan.

WardBrian avatar WardBrian commented on July 4, 2024

They align with my expectation - for numpy arrays it is a wash and for raw pointers the overhead is much less.

If it had ended up being a significant increase for numpy arrays I would be more hesitant toward adding it. However, it's really just adding one more if into a part of the code that already has a bunch of ifs for validation logic, so I'm not shocked the impact is essentially nothing.

from bridgestan.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.