Coder Social home page Coder Social logo

Comments (6)

yixuan avatar yixuan commented on June 26, 2024 1

Yeah, I'm thinking of how to refactor the code and properly return from the line search routine. I'll update here once I have any progress.

from lbfgspp.

yixuan avatar yixuan commented on June 26, 2024

Hi @izsahara, do you have any test cases to compare the results?

from lbfgspp.

cschreib-ibex avatar cschreib-ibex commented on June 26, 2024

I just went through this trouble myself, trying to reproduce the scipy implementation as close as possible. Here are the parameters that (I think) map to the scipy defaults:

LBFGSBParam<double> params;

params.epsilon = 1e-5;
params.epsilon_rel = 0.0;
params.delta = 2.22045e-9;
params.past = 1;

params.m = 10;
params.max_iterations = 15000;

params.max_linesearch = 20;
// Less sure about these ones:
params.ftol = 1e-3;
params.wolfe = 0.9;
params.min_step = 0.0;
params.max_step = std::numeric_limits<double>::infinity();
// Edit: update this to zero following comments below.
params.max_submin = 0; 

There's more to that, though. The Fortran code used by scipy doesn't generate an error when reaching the maximum number of steps in the line search; it just reverts to some previous state which, as far as I can piece together by reading their code, is the state at the start of the line search. So, I think, it means it will just stop as converged, rather than error. Does that seem right to you?

So I went into LineSearchMoreThuente.h and did that, at the very end of the function:

        if (iter >= param.max_linesearch)
        {
            // Original code:
            // throw std::runtime_error("the line search routine reached the maximum number of iterations");

            // Behavior closer to scipy: restore input state.
            x = xp;
            fx = fx_init;
            dg = dg_init;
            grad = grad_init;
            step = Scalar(0);
        }

(you'll need to store a copy grad_init of the input gradient at the start, as well)

I'd be interested in hearing your opinion @yixuan, since you must know the code and algorithm better than any of us.

from lbfgspp.

yixuan avatar yixuan commented on June 26, 2024

Thanks @cschreib-ibex for the careful investigation. Indeed, these parameters can somehow be mapped to the Scipy default, but from my own experience, it is pretty hard to exactly reproduce the backend Fortran code. The difference is most likely to happen in the line search routine.

For the params.max_submin parameter, it is something I'm experimenting to improve the L-BFGS-B algorithm, and you can set it to zero to better match the Scipy implementation.

from lbfgspp.

cschreib-ibex avatar cschreib-ibex commented on June 26, 2024

Thank you, will do that! What about the "max linesearch" behaviour; do you think it makes sense to revert to initial values rather than error? This appears to be a common occurence in our data sets when the algorithm is close to converging.

from lbfgspp.

yixuan avatar yixuan commented on June 26, 2024

This commit has removed major runtime exceptions when maximum number of line searches is reached, which should partly resolve the "max line search" issue.

There remain some exceptions in the code, which are mostly "serious" issues that are hard to proceed.

Let me first mark this as closed. Feel free to reopen if further issues occur.

from lbfgspp.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.