
A vector parameter as the first parameter in a Stan model leads to the first two parameters having a zero gradient if make/local contains `STANCFLAGS+=--O1`. (bridgestan, closed)

roualdes commented on May 24, 2024
A vector parameter as the first parameter in a Stan model leads to the first two parameters having a zero gradient if make/local contains `STANCFLAGS+=--O1`.

from bridgestan.

Comments (10)

WardBrian commented on May 24, 2024

Confirmed this, looking into it more


WardBrian commented on May 24, 2024

Okay, I compiled the same model with CmdStan and then ran ./grad diagnose init=p.json where p.json was { "fvb": [1,2,3,4,5] }, and this output the correct values for the gradient.

So, it seems like this is an issue only in BridgeStan somewhere


WardBrian commented on May 24, 2024

Additional findings:

Adding dummy real parameters before the vector changes this behavior. With one dummy parameter, only one zero appears in the gradient. With two or more, the gradient is correct.

Whenever the gradient is wrong, calling log_density_hessian seems to cause a segfault in stan::math::internal::reverse_pass_callback_vari.

The only changes to the model's .hpp file with --O1 are the introduction of some conditional_var_value_t types in the calls to the deserializer's read, indicating that SoA (struct-of-arrays) is being used.


SteveBronder commented on May 24, 2024

Just posting, this isn't a Stan level error as I wrote a test here to see what's going on. I think it's something in bridgestan


roualdes commented on May 24, 2024

And the plot thickens. Thanks all for looking into this. Much appreciated.

I had to add set_cmdstan_path!(...) to get this MWE to run, but then with or without --O1, this gives output (-27.5, [-1.0, -2.0, -3.0, -4.0, -5.0]) for me.
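For reference, the reported output is what one would expect if the model places a standard normal prior on each element of the vector and drops constant terms. That is an assumption here, since the model itself is only linked in the original issue. Under that assumption, a minimal sketch of the expected log density and gradient (the function name `std_normal_lpdf_propto` is hypothetical and not part of BridgeStan or Stan) could look like:

```c
#include <stddef.h>

/* Assumed model (not confirmed in this thread): each element of the
 * parameter vector has a standard normal prior, with constants dropped.
 * Returns the log density up to a constant; if grad is non-NULL, writes
 * the gradient with respect to each element of x into grad. */
double std_normal_lpdf_propto(const double *x, size_t n, double *grad) {
  double lp = 0.0;
  for (size_t i = 0; i < n; ++i) {
    lp += -0.5 * x[i] * x[i]; /* log density contribution of x[i] */
    if (grad) {
      grad[i] = -x[i];        /* derivative of -0.5 * x^2 */
    }
  }
  return lp;
}
```

Evaluated at [1, 2, 3, 4, 5], this gives -27.5 and a gradient of [-1, -2, -3, -4, -5], so a gradient that starts with zeros would be wrong for such a model.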

I compiled the model both from the command line and otherwise, and confirmed that the .hpp file contains return std::vector<std::string>{"stanc_version = stanc3 8fce5fb", "stancflags = --O1"};.

macOS 12.2, Apple clang version 11.0.3, Julia 1.8.2, and I honestly don't know how to check my CmdStan version.

Has anyone narrowed this down to a specific higher level language? Does this happen in Python or R, for those of you that can recreate this?


WardBrian commented on May 24, 2024

@SteveBronder and I have been able to recreate this in the C example. We seem to have narrowed it down to the model functor, since replacing that with a lambda has resolved it for me


bob-carpenter commented on May 24, 2024

Thanks, @SteveBronder. I couldn't quite follow the test. Is it using a new gradient functional that takes array arguments?

@roualdes or @WardBrian: Is that what is getting called to cause the error in BridgeStan?

Either way, I don't see how --O1 vs. --O0 could affect the soundness of the BridgeStan interface unless there are threading or memory issues somewhere, introduced either by the optimizations or in the interface.


WardBrian commented on May 24, 2024

The problem is with model_functor. We're not sure exactly what is wrong, but it appears that something in the autodiff stack is becoming a null pointer too soon. Replacing it with a more standard lambda resolves the issue, so I should have a PR ready soon.


WardBrian commented on May 24, 2024

For future reference if it ever comes up again, here is the code Steve and I were using to debug with gdb:

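#include "bridgestan.h"
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
  /* Optional first argument: path to the model's data file. */
  char* data;
  if (argc > 1) {
    data = argv[1];
  } else {
    data = "";
  }

  /* Build the model with seed 123 and chain id 0. */
  model_rng* model = construct(data, 123, 0);
  if (!model) {
    return 1;
  }
  printf("This model's name is %s.\n", name(model));
  printf("It has %d parameters.\n", param_num(model, 0, 0));

  /* Evaluate the gradient at the same point used in the issue. */
  double params[5] = {1.0, 2.0, 3.0, 4.0, 5.0};
  double val;
  double* grad = malloc(5 * sizeof(double));
  if (!grad) {
    destruct(model);
    return 1;
  }
  log_density_gradient(model, 1, 1, params, &val, grad);
  printf("%f %f %f %f %f\n", grad[0], grad[1], grad[2], grad[3], grad[4]);

  /* Release the gradient buffer before tearing down the model. */
  free(grad);
  return destruct(model);
}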

The easiest way to use this is to replace example.c with it in the c-example folder and then link to the model provided in the original issue here.
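When debugging a case like this, it can also help to compare the returned gradient against central finite differences, so that wrong entries are flagged without knowing the analytic answer. The sketch below is only illustrative: `count_gradient_mismatches`, `log_density_fn`, and `example_log_density` are hypothetical names, and the stand-in density assumes a standard normal prior on each element. In a real check the callback would wrap whatever log density call is being tested.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical callback type: log density evaluated at x[0..n-1]. */
typedef double (*log_density_fn)(const double *x, size_t n);

/* Counts the entries of grad that differ from a central finite
 * difference of f by more than tol. x is perturbed in place and
 * restored before returning. */
size_t count_gradient_mismatches(log_density_fn f, double *x, size_t n,
                                 const double *grad, double h, double tol) {
  size_t bad = 0;
  for (size_t i = 0; i < n; ++i) {
    double saved = x[i];
    x[i] = saved + h;
    double up = f(x, n);
    x[i] = saved - h;
    double down = f(x, n);
    x[i] = saved; /* restore the original coordinate */
    double fd = (up - down) / (2.0 * h);
    if (fabs(fd - grad[i]) > tol) {
      ++bad;
    }
  }
  return bad;
}

/* Stand-in log density (an assumption, not the model from the issue):
 * standard normal on each element, constants dropped. */
double example_log_density(const double *x, size_t n) {
  double lp = 0.0;
  for (size_t i = 0; i < n; ++i) {
    lp += -0.5 * x[i] * x[i];
  }
  return lp;
}
```

With the point [1, 2, 3, 4, 5], a step of 1e-5, and a tolerance of 1e-6, a gradient of [-1, -2, -3, -4, -5] yields no mismatches for this stand-in density, while a gradient whose first two entries are zero, as in the reported bug, yields two.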


SteveBronder commented on May 24, 2024

> Thanks, @SteveBronder. I couldn't quite follow the test. Is it using a new gradient functional that takes array arguments?

@bob-carpenter that function was just the log prob gradient call that bridgestan uses at the C level

