yixuan / lbfgspp

A header-only C++ library for L-BFGS and L-BFGS-B algorithms

Home Page: https://lbfgspp.statr.me/

License: MIT License

Topics: optimization, lbfgs-solver, lbfgsb-solver, l-bfgs, l-bfgs-b, limited-memory

lbfgspp's Introduction

LBFGS++

UPDATE on 2020-03-06: LBFGS++ now includes a new L-BFGS-B solver for box-constrained optimization problems. Check the example below for its usage.

LBFGS++ is a header-only C++ library that implements the Limited-memory BFGS algorithm (L-BFGS) for unconstrained minimization problems, and a modified version of the L-BFGS-B algorithm for box-constrained ones.

The code for the L-BFGS solver is derived and modified from the libLBFGS library developed by Naoaki Okazaki.

LBFGS++ is implemented as a header-only C++ library, whose only dependency, Eigen, is also header-only.

A Quick Example

To use LBFGS++, first define a functor that represents the multivariate function to be minimized. It should return the objective function value at a vector x and overwrite the vector grad with the gradient evaluated at x. For example, we could define the Rosenbrock function in the following way:

#include <Eigen/Core>
#include <iostream>
#include <LBFGS.h>

using Eigen::VectorXd;
using namespace LBFGSpp;

class Rosenbrock
{
private:
    int n;
public:
    Rosenbrock(int n_) : n(n_) {}
    double operator()(const VectorXd& x, VectorXd& grad)
    {
        double fx = 0.0;
        for(int i = 0; i < n; i += 2)
        {
            double t1 = 1.0 - x[i];
            double t2 = 10 * (x[i + 1] - x[i] * x[i]);
            grad[i + 1] = 20 * t2;
            grad[i]     = -2.0 * (x[i] * grad[i + 1] + t1);
            fx += t1 * t1 + t2 * t2;
        }
        return fx;
    }
};

Then we just need to set up the parameters, create a solver object, provide an initial guess, and run the minimization function.

int main()
{
    const int n = 10;
    // Set up parameters
    LBFGSParam<double> param;
    param.epsilon = 1e-6;
    param.max_iterations = 100;

    // Create solver and function object
    LBFGSSolver<double> solver(param);
    Rosenbrock fun(n);

    // Initial guess
    VectorXd x = VectorXd::Zero(n);
    // x will be overwritten to be the best point found
    double fx;
    int niter = solver.minimize(fun, x, fx);

    std::cout << niter << " iterations" << std::endl;
    std::cout << "x = \n" << x.transpose() << std::endl;
    std::cout << "f(x) = " << fx << std::endl;

    return 0;
}

The example can then be compiled and run.

$ g++ -I/path/to/eigen -I/path/to/lbfgspp/include -O2 example.cpp
$ ./a.out
23 iterations
x =
1 1 1 1 1 1 1 1 1 1
f(x) = 1.87948e-19

You can also use a different line search algorithm by providing a second template parameter to LBFGSSolver. For example, the code below illustrates the bracketing line search algorithm (contributed by @DirkToewe).

int main()
{
    const int n = 10;
    // Set up parameters
    LBFGSParam<double> param;
    param.epsilon = 1e-6;
    param.max_iterations = 100;

    // Create solver and function object
    LBFGSSolver<double, LineSearchBracketing> solver(param);
    Rosenbrock fun(n);

    // Initial guess
    VectorXd x = VectorXd::Zero(n);
    // x will be overwritten to be the best point found
    double fx;
    int niter = solver.minimize(fun, x, fx);

    std::cout << niter << " iterations" << std::endl;
    std::cout << "x = \n" << x.transpose() << std::endl;
    std::cout << "f(x) = " << fx << std::endl;

    return 0;
}

Box-constrained Problem

If the parameters to be optimized have simple bounds, then the L-BFGS-B solver class LBFGSBSolver can be used. The code is very similar to that of LBFGSSolver. Below is the same Rosenbrock example, but now we require all variables to be between 2 and 4.

#include <Eigen/Core>
#include <iostream>
#include <LBFGSB.h>  // Note the different header file

using Eigen::VectorXd;
using namespace LBFGSpp;

class Rosenbrock
{
private:
    int n;
public:
    Rosenbrock(int n_) : n(n_) {}
    double operator()(const VectorXd& x, VectorXd& grad)
    {
        double fx = 0.0;
        for(int i = 0; i < n; i += 2)
        {
            double t1 = 1.0 - x[i];
            double t2 = 10 * (x[i + 1] - x[i] * x[i]);
            grad[i + 1] = 20 * t2;
            grad[i]     = -2.0 * (x[i] * grad[i + 1] + t1);
            fx += t1 * t1 + t2 * t2;
        }
        return fx;
    }
};

int main()
{
    const int n = 10;
    // Set up parameters
    LBFGSBParam<double> param;  // New parameter class
    param.epsilon = 1e-6;
    param.max_iterations = 100;

    // Create solver and function object
    LBFGSBSolver<double> solver(param);  // New solver class
    Rosenbrock fun(n);

    // Bounds
    VectorXd lb = VectorXd::Constant(n, 2.0);
    VectorXd ub = VectorXd::Constant(n, 4.0);

    // Initial guess
    VectorXd x = VectorXd::Constant(n, 3.0);

    // x will be overwritten to be the best point found
    double fx;
    int niter = solver.minimize(fun, x, fx, lb, ub);

    std::cout << niter << " iterations" << std::endl;
    std::cout << "x = \n" << x.transpose() << std::endl;
    std::cout << "f(x) = " << fx << std::endl;

    return 0;
}

Note that we also allow infinite values for the lower and upper bounds. In such cases, one can set ub[i] = std::numeric_limits<double>::infinity(), for example.
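For instance, a minimal sketch of one-sided bounds (non-negative variables with no upper limit), assuming the same n as in the example above and #include <limits>:

// Lower bound 0, upper bound +inf: x[i] >= 0 with no upper limit
VectorXd lb = VectorXd::Zero(n);
VectorXd ub = VectorXd::Constant(n, std::numeric_limits<double>::infinity());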

Documentation

The API reference page contains the documentation of LBFGS++ generated by Doxygen.

License

LBFGS++ is an open source project under the MIT license.

lbfgspp's People

Contributors

dirktoewe, mpayrits, pjknowles, steinmig, yixuan


lbfgspp's Issues

Line Search throws increasing search direction exception

Hi,

I was testing the LBFGSB code with the default LineSearchMoreThuente, and it throws std::logic_error("the moving direction does not decrease the objective function value") (line 198 in LineSearchMoreThuente.h). I think this is because dg_init is computed using the gradient instead of the projected gradient? I attached the test case.

main.txt
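For reference, a minimal sketch of the projected gradient for box constraints referred to above (a hypothetical helper, not the library's code):

#include <Eigen/Core>

// Zero out gradient components whose descent direction (-g) would leave
// the feasible box, so the initial directional derivative of the line
// search only reflects feasible movement.
Eigen::VectorXd projected_gradient(const Eigen::VectorXd& x,
                                   const Eigen::VectorXd& g,
                                   const Eigen::VectorXd& lb,
                                   const Eigen::VectorXd& ub)
{
    Eigen::VectorXd pg = g;
    for (Eigen::Index i = 0; i < x.size(); i++)
    {
        // At the lower bound, -g points outside the box when g > 0;
        // at the upper bound, when g < 0. Clamp those components to zero.
        if ((x[i] <= lb[i] && g[i] > 0) || (x[i] >= ub[i] && g[i] < 0))
            pg[i] = 0.0;
    }
    return pg;
}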

Why is the stopping criterion Gradient Norm <= epsilon_rel * x.norm()?

First, thanks to the author for sharing such great source code. Recently, I have been learning L-BFGS and traced the code of this library, but I really don't know why one of the stopping conditions is:
m_gnorm <= m_param.epsilon_rel * x.norm()
Why is the norm of the gradient compared against the norm of the parameter vector? Could anyone provide a keyword or an explanation? Thank you.
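For context, a sketch of how such a relative test is typically combined with an absolute tolerance (variable names are illustrative; check the library source for the exact form):

// Absolute test: handles x near 0, where a relative test could never pass.
// Relative test: for large-magnitude x, the gradient components scale with
// the problem, so the gradient norm is compared against ||x|| instead.
const bool converged = (gnorm <= param.epsilon) ||
                       (gnorm <= param.epsilon_rel * x.norm());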

Bounded problems

Do you plan to also include bounded problems, i.e. implement L-BFGS-B?

Line search fails

When the initial x in the Rosenbrock example is changed from zeros to random, the line search fails.
VectorXf x = VectorXf::Random(n);

The line search fails the sufficient decrease condition.

I ran into this while trying to use this BFGS optimizer on a neural net, and reproduced it with the included Rosenbrock example. I haven't figured out the problem, but was wondering if you know.

LineSearchNocedalWright - NaN in Step 2 (Zooming)

In LineSearchNocedalWright.h (lines 145-146):
step = (fx_hi - fx_lo) * step_lo - (step_hi * step_hi - step_lo * step_lo) * dg_lo / 2;
step /= (fx_hi - fx_lo) - (step_hi - step_lo) * dg_lo;

The step computation is not guarded against division by zero.
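A minimal sketch of one possible guard, using the quoted variables (assumes <cmath> and <limits>; the bisection fallback is illustrative, not the library's actual fix):

// Quadratic interpolation for the trial step, guarded against a
// (near-)zero denominator by falling back to plain bisection.
const Scalar denom = (fx_hi - fx_lo) - (step_hi - step_lo) * dg_lo;
if (std::abs(denom) < std::numeric_limits<Scalar>::epsilon())
    step = (step_lo + step_hi) / Scalar(2);
else
    step = ((fx_hi - fx_lo) * step_lo
            - (step_hi * step_hi - step_lo * step_lo) * dg_lo / 2) / denom;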

Failure to converge (again...)

Hi, sorry again. I'm having trouble with convergence, again using the code you helped me to fix. I have tested the code with larger problems successfully, but I have just found a small case where it fails.

It looks relatively simple, but the minimization process fails because the line search step becomes too small after 2 iterations. Small variations in the setup make the problem solvable (for example, changing the initial guess from [1.0, 1.0] to [1.1, 1.1], or adding more points to the training set or modifying some of them).

Is this type of failure a common occurrence with this type of optimizer (as you may have guessed, I'm not really an expert in this field)? SciPy, using the same algorithm, does not usually seem to have problems. Should I just put some try/catch blocks around the optimizer, and if it fails try again with different parameters (which I should do anyway to ensure a global minimum is found, but I'm asking specifically about errors)? Or maybe it's just a bug and I can keep reporting them :)

Thanks again in advance for your help.

#include <iostream>
#include <Eigen/Eigenvalues>

#include <LBFGSB.h>

constexpr double PI = 3.141592653589793238462643383279502884197169399375105820974944;

using Matrix2D = Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor | Eigen::AutoAlign>;
using Vector = Eigen::Matrix<double, Eigen::Dynamic, 1>;

Matrix2D makeDist(const Matrix2D & x1, const Matrix2D & x2) {
    return ((-2.0 * (x1 * x2.transpose())).colwise() + x1.rowwise().squaredNorm()).rowwise() + x2.rowwise().squaredNorm().transpose();
}

Matrix2D kernel(const Matrix2D & dist, double l = 1.0, double sigma = 1.0) {
    return std::pow(sigma, 2) * ((-0.5 / std::pow(l, 2)) * dist).array().exp();
}

struct Likelihood {
    Likelihood(const Matrix2D & xData, const Vector & muData, double noise) :
            xData_(xData), muData_(muData), noise_(noise) {}

    const Matrix2D & xData_;
    const Vector & muData_;
    double noise_;

    double operator()(const Eigen::VectorXd& x, Eigen::VectorXd& grad) {
        const auto l = x[0];
        const auto sigma = x[1];

        auto dist = makeDist(xData_, xData_);
        auto k = kernel(dist, l, sigma);

        Matrix2D dkl = k.cwiseProduct(dist / std::pow(l, 3));
        Matrix2D dks = (2.0 / sigma) * k;

        k.diagonal().array() += std::pow(noise_, 2);

        auto L = k.llt();
        Matrix2D kinv = L.solve(Matrix2D::Identity(muData_.size(), muData_.size()));
        Vector alpha = L.solve(muData_);

        // Negative gradient to minimize negative log-likelihood
        grad[0] = -(0.5 * alpha.transpose() * dkl * alpha - 0.5 * kinv.cwiseProduct(dkl).sum());
        grad[1] = -(0.5 * alpha.transpose() * dks * alpha - 0.5 * kinv.cwiseProduct(dks).sum());

        // Log-likelihood
        const double ll = -0.5 * muData_.transpose() * alpha - L.matrixL().selfadjointView().diagonal().array().log().sum() - 0.5 * muData_.size() * std::log(2.0 * PI);
        std::cout << l << ", " << sigma << " ==> " << -ll << "  -  " << grad.transpose() << std::endl;
        return -ll;
    }
};

int main() {
    const double noise = 1e-8;

    // Training points to test on. Small modifications seem to make the problem
    // solvable again (for example replacing the 0 with a 1).
    Matrix2D X_train(7, 1);
    X_train << -4, -3, -2, -1, 0, 2, 4;

    Vector Y_train = X_train.array().sin().array();

    // Log-likelihood function with set data/noise
    Likelihood like(X_train, Y_train, noise);

    // Optimizer params
    LBFGSpp::LBFGSBParam<double> param;
    param.epsilon = 1e-5;
    param.max_iterations = 100;

    // Bounds
    Vector lb = Vector::Constant(2, 1e-8);
    Vector ub = Vector::Constant(2, std::numeric_limits<double>::infinity());

    // Initial guess (replacing 1.0 with other numbers seem to make the problem
    // solvable as well).
    Vector x = Vector::Constant(2, 1.0);

    LBFGSpp::LBFGSBSolver<double> solver(param);

    double log;
    solver.minimize(like, x, log, lb, ub);

    return 0;
}

The output is:

1, 1 ==> 6.96291  -  -3.03909  3.49729
1.9499, 0.68744 ==> 5.99037  -   4.90625 -14.3289
1.28814, 0.905188 ==> 5.76892  -  -3.16873  2.10955
terminate called after throwing an instance of 'std::runtime_error'
  what():  the line search step became smaller than the minimum value allowed
Aborted (core dumped)

If the starting point [1.1, 1.1] is used, it works:

1.1, 1.1 ==> 6.99387  -  -3.4496 3.58888
2.05273, 0.796194 ==> 5.19971  -   4.32208 -9.35253
1.46782, 0.982711 ==> 5.34336  -  -3.34543  2.06644
6.15734, 0.336145 ==> 1.88086e+06  -   3.82257e+06 -1.11909e+07
2.89361, 0.78613 ==> 39.5965  -   131.639 -109.667
1.86985, 0.927282 ==> 4.3294  -  -0.92675 -1.49038
2.08611, 0.992819 ==> 4.19746  -   1.20536 -2.77274
2.10575, 1.08705 ==> 4.02656  -  0.314624 -1.26704
2.18967, 1.21189 ==> 3.94199  -   0.334645 -0.571357
2.20651, 1.27356 ==> 3.92379  -  0.00729654  -0.127709
2.22573, 1.30522 ==> 3.92164  -   0.0325654 -0.0340186
2.2261, 1.30993 ==> 3.92157  -  -0.00439909  0.00135147
2.22656, 1.31028 ==> 3.92156  -   0.000116421 -1.42964e-05
2.22655, 1.31027 ==> 3.92156  -  -3.84116e-07  4.18181e-07

Same if the starting point [0.9, 0.9] is used.

0.9, 0.9 ==> 6.90955  -  -2.61983  3.17243
1.84575, 0.575104 ==> 7.52613  -   6.48836 -24.6506
1.37833, 0.735678 ==> 5.43282  -  -2.02235 -1.69158
2.20017, 0.694899 ==> 8.04254  -    13.791 -24.1841
1.6203, 0.723671 ==> 5.12384  -  -0.641197  -4.86297
1.95251, 0.837837 ==> 4.62563  -   1.38858 -5.23043
2.05031, 1.00839 ==> 4.13246  -  0.396067 -1.95868
2.14692, 1.12796 ==> 3.9945  -  0.522762 -1.14357
2.16689, 1.23065 ==> 3.9341  -  -0.221894 -0.153314
2.23342, 1.30105 ==> 3.9229  -   0.202664 -0.141067
2.22493, 1.30545 ==> 3.92161  -   0.0169154 -0.0246754
2.22623, 1.30971 ==> 3.92157  -  -0.000161013  -0.00168692
2.22654, 1.31025 ==> 3.92156  -  -9.30085e-05  9.63869e-06
2.22655, 1.31026 ==> 3.92156  -  -1.78201e-06  9.67718e-07
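As for the try/catch question above, a minimal sketch of a restart wrapper (hypothetical; it reuses solver, like, lb, and ub from the reproduction code and assumes #include <stdexcept>):

// Retry the minimization from randomly perturbed starting points if the
// line search throws.
double fx = 0.0;
Vector x;
for (int attempt = 0; attempt < 10; ++attempt)
{
    // Perturb the original guess [1.0, 1.0] by up to +/- 0.1 per component
    x = Vector::Constant(2, 1.0) + 0.1 * Vector::Random(2);
    try
    {
        solver.minimize(like, x, fx, lb, ub);
        break;  // success: x and fx hold the best point found
    }
    catch (const std::runtime_error& e)
    {
        std::cerr << "attempt " << attempt << " failed: " << e.what() << std::endl;
    }
}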

Division by zero in BFGSMat.h (apply_Hv(...))

Hello, first of all, thanks for this amazing library.

As in issue #18, I found that division by zero could sometimes happen while working with the library. In my case, it happened during the call to apply_Hv(...) in BFGSMat.h, due to zero scalar values (m_ys and/or m_theta) set in add_correction of the same file. I have to admit that I do not know whether it is "mathematically/theoretically" possible to have m_ys and/or m_theta set to zero. It happened when I reproduced the procedure of the article DOI: 10.1109/TVCG.2018.2810279.

Anyway, I fixed it on my side as you did in issue #18:

const Scalar eps = std::numeric_limits<Scalar>::epsilon();
m_ys[loc] = ys;
if (m_ys[loc] < eps) m_ys[loc] = eps;

m_theta = m_y.col(loc).squaredNorm() / m_ys[loc];
if (m_theta < eps) m_theta = eps;

Uninitialized matrices lead to undefined behaviour

While playing with the example code you helped me fix in #7, I started to get randomly erratic behavior while changing lines of code completely unrelated to the optimizer. Sometimes by simply adding or removing a line of code I can consistently reproduce the optimizer going in a completely different direction, and subsequently throwing due to some unfulfilled condition.

Since in my experience these kinds of problems usually come from uninitialized values, I got curious, compiled my example in debug mode, and ran it through valgrind --leak-check=full --track-origins=yes. Valgrind found many cases of conditional jumps depending on uninitialized variables, which confirmed my suspicions.

The source seems to be the various matrix resizing operations in the reset() functions of the optimizer, the matrices (BFGSMat), etc. I'm not sure if you are aware, but when resizing, Eigen does not automatically zero the matrices. I have fixed the problems by calling setZero() after each resize, but I am not sure whether this is OK or whether you are explicitly avoiding the initialization because you want to do it elsewhere in the code to be more efficient.

An example of the change I have made to LBFGSBSolver to fix the problem:

    inline void reset(int n)
    {
        const int m = m_param.m;
        m_bfgs.reset(n, m);
        m_xp.resize(n);
        m_xp.setZero();
        m_grad.resize(n);
        m_grad.setZero();
        m_gradp.resize(n);
        m_gradp.setZero();
        m_drt.resize(n);
        m_drt.setZero();
        if(m_param.past > 0) {
            m_fx.resize(m_param.past);
            m_fx.setZero();
        }
    }

I am attaching the report here in case you need it, as it is fairly long (~5000 lines). Please let me know if I can help you more.

valgrind_report.txt

LBFGS Breaks on this trivial case

Dear oh dear....

OUTPUT

terminate called after throwing an instance of 'std::runtime_error'
  what():  the line search routine reached the maximum number of iterations
Aborted (core dumped)

CODE

#include <Eigen/Core>
#include <cmath>
#include <iostream>
#include <LBFGS.h>

using Eigen::VectorXf;
using Eigen::MatrixXf;
using namespace LBFGSpp;

class Rosenbrock
{
private:
    int n;
public:
    Rosenbrock(int n_) : n(n_) {}
    float operator()(const VectorXf& x, VectorXf& grad)
    {
        float out = 0;
        for (int i = 0; i < n-1; ++i) {
            out += 100 * pow(x[i + 1] - pow(x[i], 2), 2) + pow(1 - x[i], 2);
        }

        grad[0] = -400 * x[0] * (x[1] - pow(x[0], 2)) - 2 * (1 - x[0]);
        grad[n - 1] = 200 * (x[n - 1] - pow(x[n - 2], 2));
        for (int i = 1; i < n - 1; ++i) {
            float xm = x[i];
            grad[i] = 200 * (xm - pow(x[i - 1], 2)) - 400 * (x[i + 1] - pow(xm, 2)) * xm - 2 * (1 - xm);
        }
        return out;
    }
};

int main()
{
    const int n = 5;
    LBFGSParam<float> param;
    LBFGSSolver<float> solver(param);
    Rosenbrock fun(n);

    VectorXf x = VectorXf::Zero(n);
    x << 1.3, 0.7, 0.8, 1.9, 1.2;//, 0.5, 1.3, 1.5, 1.2, 0.9;

    float fx;
    int niter = solver.minimize(fun, x, fx);

    std::cout << niter << " iterations" << std::endl;
    std::cout << "x = \n" << x.transpose() << std::endl;
    std::cout << "f(x) = " << fx << std::endl;

    return 0;
}

Bounded LBFGS (box) doesn't work well

The implementation of L-BFGS-B doesn't work well.
In this case the bounds actually cause more problems.
I've arbitrarily set the lower bound to 1.1, which is higher than the optimum of 1.
But the output just gets worse with each index.

OUTPUT

18 iterations
x = 
    1.1 1.11349 1.19703 1.41582       2  // why does it keep increasing with each index? The optimal output should be 1.1 for all
f(x) = 1.38064

CODE

#include <Eigen/Core>
#include <cmath>
#include <iostream>
#include <LBFGSB.h>

using namespace LBFGSpp;
using Eigen::VectorXf;

typedef double Scalar;
typedef VectorXf Vector;

class Rosenbrock
{
private:
    int n;
public:
    Rosenbrock(int n_) : n(n_) {}
    float operator()(const VectorXf& x, VectorXf& grad)
    {
        float out = 0;
        for (int i = 0; i < n-1; ++i) {
            out += 100 * pow(x[i + 1] - pow(x[i], 2), 2) + pow(1 - x[i], 2);
        }

        grad[0] = -400 * x[0] * (x[1] - pow(x[0], 2)) - 2 * (1 - x[0]);
        grad[n - 1] = 200 * (x[n - 1] - pow(x[n - 2], 2));
        for (int i = 1; i < n - 1; ++i) {
            float xm = x[i];
            grad[i] = 200 * (xm - pow(x[i - 1], 2)) - 400 * (x[i + 1] - pow(xm, 2)) * xm - 2 * (1 - xm);
        }
        return out;
    }
};

int main()
{
    const int n = 5;
    LBFGSBParam<float> param;
    LBFGSBSolver<float> solver(param);
    Rosenbrock fun(n);

    // Variable bounds
    Vector lb = Vector::Constant(n, 1.1);
    Vector ub = Vector::Constant(n, 2.0);
    Vector x = Vector::Constant(n, 0);
    x << 1.3, 0.7, 0.8, 1.9, 1.2;//, 0.5, 1.3, 1.5, 1.2, 0.9;

    float fx;
    int niter = solver.minimize(fun, x, fx, lb, ub);

    std::cout << niter << " iterations" << std::endl;
    std::cout << "x = \n" << x.transpose() << std::endl;
    std::cout << "f(x) = " << fx << std::endl;

    return 0;
}

Line Search failed using LBFGS for optimisation of chemical structures

Hey,

I am using this nice header-only LBFGS implementation to provide a geometry optimisation method in my molecular modelling program (https://github.com/conradhuebler/curcuma).

In order to have more control over the optimisation with respect to my quantities of interest, I made some changes to allow single steps during the minimisation and yesterday I merged the current upstream repo into my fork. However, now the line search and the optimisation fail.

As I merged the repos and need my personal changes, I cannot track down the specific commit that makes it fail.

So can I tweak some settings to recover the behaviour of the line search before commit 563106b (Author: Yixuan Qiu, Date: Fri May 27 09:13:20 2022 +0800)?

The problem is that either the number of iterations in the line search is too low (thousands are not enough) or it takes too long with no positive outcome.

Is there something I could try?
Thanks and thanks for making LBFGSpp,
Conrad
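For what it is worth, LBFGSParam exposes several line-search related fields that may be worth experimenting with in a case like this (the values below are only illustrative, not recommendations):

LBFGSpp::LBFGSParam<double> param;
param.max_linesearch = 100;  // cap on line search trials per iteration
param.ftol = 1e-4;           // sufficient decrease (Armijo) coefficient
param.wolfe = 0.9;           // curvature condition coefficient
param.min_step = 1e-20;      // smallest step the line search may try
param.max_step = 1e+20;      // largest step the line search may try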

Some compiler warnings.

modules/lbfgs/thirdparty/../thirdparty/LBFGSpp/include/LBFGSpp/BKLDLT.h: In member function 'int LBFGSpp::BKLDLT<Scalar>::solve_inplace(LBFGSpp::BKLDLT<Scalar>::GenericVector) const':
Error: modules/lbfgs/thirdparty/../thirdparty/LBFGSpp/include/LBFGSpp/BKLDLT.h:523:20: error: declaration of 'i' shadows a previous local [-Werror=shadow]
  523 |         for (Index i = npermc - 1; i >= 0; i--)
      |                    ^
modules/lbfgs/thirdparty/../thirdparty/LBFGSpp/include/LBFGSpp/BKLDLT.h:507:15: note: shadowed declaration is here
  507 |         Index i = (m_perm[m_n - 1] < 0) ? (m_n - 3) : (m_n - 2);
      |               ^
In file included from modules/lbfgs/thirdparty/../thirdparty/LBFGSpp/include/LBFGSB.h:14,
                 from modules/lbfgs/thirdparty/lbfgsbpp.h:39,
                 from modules/lbfgs/register_types.cpp:33:
modules/lbfgs/thirdparty/../thirdparty/LBFGSpp/include/LBFGSpp/LineSearchMoreThuente.h: In static member function 'static int LBFGSpp::LineSearchMoreThuente<Scalar>::LineSearch(Foo&, const SolverParam&, const Vector&, const Vector&, const Scalar&, Scalar&, Scalar&, LBFGSpp::LineSearchMoreThuente<Scalar>::Vector&, Scalar&, LBFGSpp::LineSearchMoreThuente<Scalar>::Vector&)':
Error: modules/lbfgs/thirdparty/../thirdparty/LBFGSpp/include/LBFGSpp/LineSearchMoreThuente.h:415:30: error: declaration of 'ft' shadows a previous local [-Werror=shadow]
  415 |                 const Scalar ft = fx - fx_init - step * test_decr;
      |                              ^~
modules/lbfgs/thirdparty/../thirdparty/LBFGSpp/include/LBFGSpp/LineSearchMoreThuente.h:284:26: note: shadowed declaration is here
  284 |             const Scalar ft = fx - fx_init - step * test_decr;
      |                          ^~
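Both diagnostics are plain -Wshadow complaints; a self-contained sketch of the usual fix (rename the inner declaration) looks like this:

// Minimal illustration of the fix: the inner loop counter gets its own
// name instead of re-declaring the outer 'i' (as at BKLDLT.h:507/523).
void shadow_fix_sketch(long npermc)
{
    long i = npermc - 2;                    // outer index stays 'i'
    for (long j = npermc - 1; j >= 0; j--)  // was 'i'; renamed to 'j'
    {
        // ... loop body, free to read both i and j ...
    }
    (void)i;  // keep this sketch warning-free
}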

How to create a functor for LBFGSpp given an objective function and its gradient

Hi Prof. Yi Xuan,

Thank you for sharing this library. I don't know much about optimization or C++ functors. I am trying to convert an optimization problem from Python's scipy.optimize to C++ using your library. In Python, the input arguments to the minimize method include the scalar objective function and its gradient, among others. I have an analytic expression for the gradient and have implemented it as a function that returns an Eigen vector, but I am struggling to combine these into a functor that I can give to LBFGSpp.
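A minimal sketch of such an adapter, with a toy quadratic standing in for the SciPy objective and its analytic gradient (the names objective and gradient are hypothetical placeholders):

#include <Eigen/Core>

// Toy stand-ins for the user's objective and analytic gradient:
// f(x) = 0.5 * ||x||^2, grad f(x) = x. Replace with your own functions.
double objective(const Eigen::VectorXd& x) { return 0.5 * x.squaredNorm(); }
Eigen::VectorXd gradient(const Eigen::VectorXd& x) { return x; }

// Adapter in the form LBFGS++ expects: return f(x) and overwrite 'grad'.
struct Objective
{
    double operator()(const Eigen::VectorXd& x, Eigen::VectorXd& grad)
    {
        grad = gradient(x);
        return objective(x);
    }
};

An Objective instance can then be passed to solver.minimize() exactly like the Rosenbrock functor in the examples above.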

the line search routine reached the maximum number of iterations

Hi, thanks for writing this library! While using it, I encountered the following runtime error,

terminate called after throwing an instance of 'std::runtime_error'
  what():  the line search routine reached the maximum number of iterations
Aborted (core dumped)

My objective function has 200 free parameters; admittedly, this is not a very easy task. However, I previously wrote a Python prototype for this task using scipy.optimize with the L-BFGS method, and it works fine. I also checked my gradient against a numerically computed gradient and they agree. What confuses me a bit is that, after seeing this error, I set
param.max_iterations = 0; which, according to the docs, should allow the optimizer to continue until an error occurs or it converges. But I am still getting the same error about the maximum number of iterations being reached. I wonder if something else is happening? Thanks a lot!
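Note that the quoted error comes from the line search's own iteration cap, which is separate from the outer iteration limit; a sketch, assuming LBFGS++'s parameter fields:

LBFGSpp::LBFGSParam<double> param;
param.max_iterations = 0;    // outer L-BFGS iterations: 0 = run until
                             // convergence or error
param.max_linesearch = 100;  // inner line search trials per iteration;
                             // this cap is what triggers the quoted error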

More-Thuente line search cannot find a proper step

Hi!

I've ported your fantastic library (L-BFGS-B only) to Java and started testing it against different cases. TBH, I understand most of the concepts, but a lot of the code I copy&pasted without deeper understanding. Anyway, while testing my port and your library against a number of targets, the results are consistent (successes and failures).

The case I am working on now is the Bohachevsky 3 function.

Scalar operator()(const Vector& in, Vector& grad) {
    Scalar x1 = in[0];
    Scalar x2 = in[1];

    grad[0] = 2*x1 + 0.3*std::sin(3.0*PI*x1 + 4.0*PI*x2)*3.0*PI;
    grad[1] = 4*x2 + 0.4*std::sin(3.0*PI*x1 + 4.0*PI*x2)*4.0*PI;

    return x1*x1 + 2.0*x2*x2 - 0.3*std::cos(3.0*PI*x1 + 4.0*PI*x2) + 0.3;
}

lb is [-100,-100], ub is [100,100] and initial x is [-100,-100].

The line search algorithm fails to find a proper step during the sixth iteration (k=6), for x=[-1.17596, -1.02192], grad=[0.291111, 0.61104] and drt=[-0.0207124, -0.0218626].

The step gets smaller and smaller until it hits either the minimum step size or the maximum number of iterations.

My investigation shows that at the beginning of the line search the first Wolfe condition is false and the second is true; later the second condition becomes false, and much later the first becomes true. Below is the debug output from my Java code (your code gives a similar result).

So either the More-Thuente search does something wrong, or the Cauchy point is wrong, or something else entirely. Unfortunately I can't figure it out.

R's optim works well for this case.
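For readers following the log below, the two conditions printed at each trial are the strong Wolfe conditions; a schematic check (dg0 is the initial directional derivative grad(x0).dot(drt), negative for a descent direction; ftol and wolfe default to 1e-4 and 0.9 in LBFGS++):

#include <cmath>

// Wolfe condition 1 (sufficient decrease) and condition 2 (strong
// curvature), as printed in the trace below. test_decr in the log is
// ftol * dg0 and test_curv is wolfe * |dg0|.
bool strong_wolfe(double fx0, double dg0, double fx_new, double dg_new,
                  double step, double ftol = 1e-4, double wolfe = 0.9)
{
    const bool cond1 = fx_new <= fx0 + step * ftol * dg0;
    const bool cond2 = std::abs(dg_new) <= wolfe * std::abs(dg0);
    return cond1 && cond2;
}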

########## K = 6 ##########
---------- line search ----------
      xp: [   -1.175965   -1.021925]
       x: [   -1.175965   -1.021925]
      fx: 3.6649880842636957
    grad: [    0.291111    0.611040]
      dg: -0.019388518342929335
    step: 1.0
step_max: 4527.278072232203
     drt: [   -0.020712   -0.021863]
-> before wolfe and loop
test_decr: -1.9388518342929334E-6
test_curv: 0.017449666508636403
        x: [   -1.196677   -1.043787]
       fx: 3.6890159788215215
     grad: [   -0.491640   -0.794323]
       dg: 0.02754899968783143
wolfe cond 1: 3.6890159788215215 <= 3.6649861454118615 == false
wolfe cond 2: 0.02754899968783143 <= 0.017449666508636403 == false
-> entering loop
iter: 0
  ft: 0.02402983340966013
  gt: 0.027550938539665722
-- case 1, 0.02402983340966013 > 0.0
-- new_step: 0.1281449854770161
     x: [   -1.178619   -1.024726]
    fx: 3.666020639854685
  grad: [    0.220565    0.483856]
    dg: -0.015146782672016433
  wolfe cond 1: 3.666020639854685 <= 3.664987835809556 == false
  wolfe cond 2: 0.015146782672016433 <= 0.017449666508636403 == true
iter: 1
  ft: 0.0010328040451292847
  gt: -0.01514484382018214
-- case 1, 0.0010328040451292847 > 0.0
-- new_step: 0.018504805795125423
     x: [   -1.176348   -1.022329]
    fx: 3.6651090720284976
  grad: [    0.281510    0.593717]
    dg: -0.01881094795535289
  wolfe cond 1: 3.6651090720284976 <= 3.664988048385619 == false
  wolfe cond 2: 0.01881094795535289 <= 0.017449666508636403 == false
iter: 2
  ft: 1.2102364287850891E-4
  gt: -0.018809009103518595
-- case 1, 1.2102364287850891E-4 > 0.0
-- new_step: 0.002722426236016934
     x: [   -1.176021   -1.021984]
    fx: 3.6650053144570243
  grad: [    0.289711    0.608514]
    dg: -0.019304292377753038
  wolfe cond 1: 3.6650053144570243 <= 3.6649880789853144 == false
  wolfe cond 2: 0.019304292377753038 <= 0.017449666508636403 == false
iter: 3
  ft: 1.7235471709656006E-5
  gt: -0.019302353525918744
-- case 1, 1.7235471709656006E-5 > 0.0
-- new_step: 4.0163395100049855E-4
     x: [   -1.175973   -1.021933]
    fx: 3.664990613925348
  grad: [    0.290905    0.610668]
    dg: -0.01937610886411797
  wolfe cond 1: 3.664990613925348 <= 3.664988083484987 == false
  wolfe cond 2: 0.01937610886411797 <= 0.017449666508636403 == false
iter: 4
  ft: 2.5304403608734615E-6
  gt: -0.019374170012283677
-- case 1, 2.5304403608734615E-6 > 0.0
-- new_step: 5.927648068333564E-5
     x: [   -1.175966   -1.021926]
    fx: 3.66498845734526
  grad: [    0.291080    0.610985]
    dg: -0.019386687201646168
  wolfe cond 1: 3.66498845734526 <= 3.664988084148767 == false
  wolfe cond 2: 0.019386687201646168 <= 0.017449666508636403 == false
iter: 5
  ft: 3.731964926221225E-7
  gt: -0.019384748349811874
-- case 1, 3.731964926221225E-7 > 0.0
-- new_step: 8.749044727386978E-6
     x: [   -1.175965   -1.021925]
    fx: 3.6649881393236887
  grad: [    0.291106    0.611032]
    dg: -0.019388248079223333
  wolfe cond 1: 3.6649881393236887 <= 3.6649880842467324 == false
  wolfe cond 2: 0.019388248079223333 <= 0.017449666508636403 == false
iter: 6
  ft: 5.5076956110783376E-8
  gt: -0.01938630922738904
-- case 1, 5.5076956110783376E-8 > 0.0
-- new_step: 1.29134633845694E-6
     x: [   -1.175965   -1.021925]
    fx: 3.6649880923903435
  grad: [    0.291110    0.611039]
    dg: -0.019388478452564938
  wolfe cond 1: 3.6649880923903435 <= 3.664988084261192 == false
  wolfe cond 2: 0.019388478452564938 <= 0.017449666508636403 == false
iter: 7
  ft: 8.129151528361872E-9
  gt: -0.019386539600730644
-- case 1, 8.129151528361872E-9 > 0.0
-- new_step: 1.906011069354277E-7
     x: [   -1.175965   -1.021925]
    fx: 3.6649880854631744
  grad: [    0.291111    0.611040]
    dg: -0.019388512455164743
  wolfe cond 1: 3.6649880854631744 <= 3.6649880842633262 == false
  wolfe cond 2: 0.019388512455164743 <= 0.017449666508636403 == false
iter: 8
  ft: 1.1998482858618813E-9
  gt: -0.01938657360333045
-- case 1, 1.1998482858618813E-9 > 0.0
-- new_step: 2.813250173095632E-8
     x: [   -1.175965   -1.021925]
    fx: 3.664988084440737
  grad: [    0.291111    0.611040]
    dg: -0.019388517473902345
  wolfe cond 1: 3.664988084440737 <= 3.664988084263641 == false
  wolfe cond 2: 0.019388517473902345 <= 0.017449666508636403 == false
iter: 9
  ft: 1.7709603732982997E-10
  gt: -0.01938657862206805
-- case 1, 1.7709603732982997E-10 > 0.0
-- new_step: 4.152325452828433E-9
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842898264
  grad: [    0.291111    0.611040]
    dg: -0.019388518214661702
  wolfe cond 1: 3.6649880842898264 <= 3.6649880842636877 == false
  wolfe cond 2: 0.019388518214661702 <= 0.017449666508636403 == false
iter: 10
  ft: 2.6138703940608333E-11
  gt: -0.019386579362827408
-- case 1, 2.6138703940608333E-11 > 0.0
-- new_step: 6.128818099384239E-10
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842675513
  grad: [    0.291111    0.611040]
    dg: -0.019388518323996747
  wolfe cond 1: 3.6649880842675513 <= 3.6649880842636944 == false
  wolfe cond 2: 0.019388518323996747 <= 0.017449666508636403 == false
iter: 11
  ft: 3.8567708069396475E-12
  gt: -0.019386579472162453
-- case 1, 3.8567708069396475E-12 > 0.0
-- new_step: 9.047013223848551E-11
     x: [   -1.175965   -1.021925]
    fx: 3.664988084264264
  grad: [    0.291111    0.611040]
    dg: -0.0193885183401345
  wolfe cond 1: 3.664988084264264 <= 3.6649880842636957 == false
  wolfe cond 2: 0.0193885183401345 <= 0.017449666508636403 == false
iter: 12
  ft: 5.686095967899195E-13
  gt: -0.019386579488300207
-- case 1, 5.686095967899195E-13 > 0.0
-- new_step: 1.3359590455377545E-11
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842637788
  grad: [    0.291111    0.611040]
    dg: -0.019388518342516703
  wolfe cond 1: 3.6649880842637788 <= 3.6649880842636957 == false
  wolfe cond 2: 0.019388518342516703 <= 0.017449666508636403 == false
iter: 13
  ft: 8.307058450842152E-14
  gt: -0.01938657949068241
-- case 1, 8.307058450842152E-14 > 0.0
-- new_step: 1.9790395282910703E-12
     x: [   -1.175965   -1.021925]
    fx: 3.664988084263708
  grad: [    0.291111    0.611040]
    dg: -0.019388518342868165
  wolfe cond 1: 3.664988084263708 <= 3.6649880842636957 == false
  wolfe cond 2: 0.019388518342868165 <= 0.017449666508636403 == false
iter: 14
  ft: 1.2438334940221318E-14
  gt: -0.01938657949103387
-- case 1, 1.2438334940221318E-14 > 0.0
-- new_step: 2.92242071913514E-13
     x: [   -1.175965   -1.021925]
    fx: 3.664988084263697
  grad: [    0.291111    0.611040]
    dg: -0.019388518342920255
  wolfe cond 1: 3.664988084263697 <= 3.6649880842636957 == false
  wolfe cond 2: 0.019388518342920255 <= 0.017449666508636403 == false
iter: 15
  ft: 1.332834243627375E-15
  gt: -0.01938657949108596
-- case 1, 1.332834243627375E-15 > 0.0
-- new_step: 4.69847459607086E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 16
  ft: -4.439981133891729E-16
  gt: -0.01938657949109344
-- case 2, -0.0 > 0.0
-- new_step: 9.866796651748807E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342926087
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342926087 <= 0.017449666508636403 == false
iter: 17
  ft: -4.4389790728219423E-16
  gt: -0.019386579491091793
-- case 1, -4.4389790728219423E-16 > -4.439981133891729E-16
-- new_step: 5.79052036971028E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636957
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927485
  wolfe cond 1: 3.6649880842636957 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927485 <= 0.017449666508636403 == false
iter: 18
  ft: 1.122696104032337E-19
  gt: -0.01938657949109319
-- case 1, 1.122696104032337E-19 > -4.439981133891729E-16
-- new_step: 4.760785175092509E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 19
  ft: -4.439969052793686E-16
  gt: -0.01938657949109344
-- case 1, -4.439969052793686E-16 > -4.439981133891729E-16
-- new_step: 4.711640572134368E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 20
  ft: -4.439978581204045E-16
  gt: -0.01938657949109344
-- case 1, -4.439978581204045E-16 > -4.439981133891729E-16
-- new_step: 4.701256514140482E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 21
  ft: -4.439980594519034E-16
  gt: -0.01938657949109344
-- case 1, -4.439980594519034E-16 > -4.439981133891729E-16
-- new_step: 4.699062404232806E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 22
  ft: -4.439981019924436E-16
  gt: -0.01938657949109344
-- case 1, -4.439981019924436E-16 > -4.439981133891729E-16
-- new_step: 4.69859879756223E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 23
  ft: -4.4399811098109E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811098109E-16 > -4.439981133891729E-16
-- new_step: 4.698500839332691E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 24
  ft: -4.4399811288035497E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811288035497E-16 > -4.439981133891729E-16
-- new_step: 4.6984801410871304E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 25
  ft: -4.4399811328166325E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811328166325E-16 > -4.439981133891729E-16
-- new_step: 4.6984757674434593E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 26
  ft: -4.439981133664617E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133664617E-16 > -4.439981133891729E-16
-- new_step: 4.698474842357829E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 27
  ft: -4.439981133843978E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133843978E-16 > -4.439981133891729E-16
-- new_step: 4.6984746085145E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 28
  ft: -4.4399811338893167E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811338893167E-16 > -4.439981133891729E-16
-- new_step: 4.698474602292058E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 29
  ft: -4.4399811338905227E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811338905227E-16 > -4.439981133891729E-16
-- new_step: 4.698474574428066E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 30
  ft: -4.4399811338959254E-16
  gt: -0.01938657949109344
-- case 3
-- new_step: 4.698474592818301E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 31
  ft: -4.4399811338923597E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811338923597E-16 > -4.4399811338959254E-16
-- new_step: 4.6984745836227235E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 32
  ft: -4.4399811338941425E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811338941425E-16 > -4.4399811338959254E-16
-- new_step: 4.698474579025165E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 33
  ft: -4.439981133895034E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133895034E-16 > -4.4399811338959254E-16
-- new_step: 4.698474576726386E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 34
  ft: -4.4399811338954797E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811338954797E-16 > -4.4399811338959254E-16
-- new_step: 4.698474550823932E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 35
  ft: -4.439981133900502E-16
  gt: -0.01938657949109344
-- case 3
-- new_step: 4.6984745679195516E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 36
  ft: -4.439981133897187E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133897187E-16 > -4.439981133900502E-16
-- new_step: 4.698474559371315E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 37
  ft: -4.4399811338988447E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811338988447E-16 > -4.439981133900502E-16
-- new_step: 4.6984745200910456E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 38
  ft: -4.4399811339064606E-16
  gt: -0.01938657949109344
-- case 3
-- new_step: 4.698474559371315E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 39
  ft: -4.4399811338988447E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811338988447E-16 > -4.4399811339064606E-16
-- new_step: 4.698474539729216E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 40
  ft: -4.439981133902653E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133902653E-16 > -4.4399811339064606E-16
-- new_step: 4.698474529909149E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 41
  ft: -4.439981133904557E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133904557E-16 > -4.4399811339064606E-16
-- new_step: 4.698474500246615E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 42
  ft: -4.439981133910308E-16
  gt: -0.01938657949109344
-- case 3
-- new_step: 4.698474529909149E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 43
  ft: -4.439981133904557E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133904557E-16 > -4.439981133910308E-16
-- new_step: 4.6984745150763994E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 44
  ft: -4.439981133907433E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133907433E-16 > -4.439981133910308E-16
-- new_step: 4.6984745076611366E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 45
  ft: -4.4399811339088706E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811339088706E-16 > -4.439981133910308E-16
-- new_step: 4.698474479200454E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 46
  ft: -4.4399811339143887E-16
  gt: -0.01938657949109344
-- case 3
-- new_step: 4.6984745076611366E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 47
  ft: -4.4399811339088706E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811339088706E-16 > -4.4399811339143887E-16
-- new_step: 4.698474493429372E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 48
  ft: -4.4399811339116296E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811339116296E-16 > -4.4399811339143887E-16
-- new_step: 4.6984744863142014E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 49
  ft: -4.439981133913009E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133913009E-16 > -4.4399811339143887E-16
-- new_step: 4.698474447750787E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 50
  ft: -4.439981133920486E-16
  gt: -0.01938657949109344
-- case 3
-- new_step: 4.6984744863142014E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 51
  ft: -4.439981133913009E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133913009E-16 > -4.439981133920486E-16
-- new_step: 4.698474467030566E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 52
  ft: -4.439981133916748E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133916748E-16 > -4.439981133920486E-16
-- new_step: 4.6984744573897126E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 53
  ft: -4.4399811339186175E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811339186175E-16 > -4.439981133920486E-16
-- new_step: 4.6984744525700085E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 54
  ft: -4.439981133919552E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133919552E-16 > -4.439981133920486E-16
-- new_step: 4.6984744501602776E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 55
  ft: -4.439981133920019E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133920019E-16 > -4.439981133920486E-16
-- new_step: 4.698474448955472E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 56
  ft: -4.4399811339202524E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811339202524E-16 > -4.439981133920486E-16
-- new_step: 4.698474448353099E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 57
  ft: -4.439981133920369E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133920369E-16 > -4.439981133920486E-16
-- new_step: 4.698474448051928E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 58
  ft: -4.439981133920428E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133920428E-16 > -4.439981133920486E-16
-- new_step: 4.6984744479013504E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 59
  ft: -4.439981133920457E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133920457E-16 > -4.439981133920486E-16
-- new_step: 4.6984744478260615E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 60
  ft: -4.439981133920472E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133920472E-16 > -4.439981133920486E-16
-- new_step: 4.698474447788421E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 61
  ft: -4.4399811339204787E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811339204787E-16 > -4.439981133920486E-16
-- new_step: 4.698474447769602E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 62
  ft: -4.4399811339204826E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811339204826E-16 > -4.439981133920486E-16
-- new_step: 4.6984744477601935E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 63
  ft: -4.439981133920484E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133920484E-16 > -4.439981133920486E-16
-- new_step: 4.6984744477554907E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 64
  ft: -4.439981133920485E-16
  gt: -0.01938657949109344
-- case 1, -4.439981133920485E-16 > -4.439981133920486E-16
-- new_step: 4.698474447753139E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
iter: 65
  ft: -4.4399811339204856E-16
  gt: -0.01938657949109344
-- case 1, -4.4399811339204856E-16 > -4.439981133920486E-16
-- new_step: 4.6984744127456E-14
     x: [   -1.175965   -1.021925]
    fx: 3.6649880842636953
  grad: [    0.291111    0.611040]
    dg: -0.019388518342927735
  wolfe cond 1: 3.6649880842636953 <= 3.6649880842636957 == true
  wolfe cond 2: 0.019388518342927735 <= 0.017449666508636403 == false
[iterations 66-99 repeat essentially verbatim: each trial step stays near 4.698474E-14, x, fx, grad, and dg never change, the trace alternates between case 1 and case 3, and Wolfe condition 2 (0.01939 <= 0.01745) fails on every trial]
org.generateme.lbfgsb.LBFGSBException: the line search routine reached the maximum number of iterations
	at org.generateme.lbfgsb.LineSearch.<init>(LineSearch.java:305)
	at org.generateme.lbfgsb.LBFGSB.minimize(LBFGSB.java:131)
	at org.generateme.lbfgsb.examples.Bohachevsky3.main(Bohachevsky3.java:34)

Is there an issue with your Cauchy point finding step?

I suspect there is an issue with Step 3 of the Cauchy point computation.
You update the vector 'p' during the loop, but according to the paper,
'p' should remain fixed within a single iteration. I think the correct
approach is to record the updates in a temporary vector and add them to
'p' outside the loop, as sketched below.
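
A minimal sketch of the accumulate-then-apply pattern the reporter describes (illustrative only; correction() and all names here are hypothetical stand-ins, not LBFGS++'s actual variables):

#include <Eigen/Core>

// Hypothetical per-breakpoint correction term; stands in for the paper's
// update, which must be computed from the *original*, unmodified p.
Eigen::VectorXd correction(int i, const Eigen::VectorXd& p)
{
    return 0.1 * p / (i + 1);
}

void update_p(Eigen::VectorXd& p, int nbreaks)
{
    Eigen::VectorXd dp = Eigen::VectorXd::Zero(p.size());
    for (int i = 0; i < nbreaks; i++)
        dp += correction(i, p);  // reads p, never writes it inside the loop
    p += dp;                     // apply the accumulated update once, at the end
}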

Use LBFGS++ with Spectra

Thank you for making these two amazing libraries! Currently, I would like to use the two libraries in my project. However, if I use both LBFGS++ and Spectra in a single translation unit, I encounter compile errors saying that the symbol BKLDLT is undefined. A minimal code example to reproduce the problem is

#include "Spectra/MatOp/DenseSymShiftSolve.h"
#include "LBFGSpp/LBFGS.h"

I guess the problem is due to the same include guard name BK_LDLT_H in the files named BKLDLT.h that both libraries ship as header dependencies. The library included later sees its BKLDLT.h as an almost empty file, because the library included earlier has already defined BK_LDLT_H. Normally that would be okay, since the contents of one file would simply be reused by both libraries, but in this case the two BKLDLT.h files are almost identical except that they define the class template BKLDLT in different namespaces, which makes the file from one library unusable for the other.

The problem goes away if I rename the include guard macros to make them different (such as LBFGSPP_BK_LDLT_H and SPECTRA_BK_LDLT_H), as sketched below. IMO this makes the header-only libraries more cleanly separated and makes include guard collisions highly unlikely. I'm curious to know what you think!
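
A minimal sketch of the suggested rename, using the reporter's proposed macro names (the class bodies are elided):

// LBFGS++'s BKLDLT.h -- before: #ifndef BK_LDLT_H, which collides with Spectra
#ifndef LBFGSPP_BK_LDLT_H
#define LBFGSPP_BK_LDLT_H
// ... class template BKLDLT in namespace LBFGSpp ...
#endif

// Spectra's BKLDLT.h -- same pattern, with a SPECTRA_ prefix
#ifndef SPECTRA_BK_LDLT_H
#define SPECTRA_BK_LDLT_H
// ... class template BKLDLT in namespace Spectra ...
#endif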

Comparison with Python's scipy.optimize L-BFGS-B

My problem is optimizing the log marginal likelihood of a Gaussian process. I was wondering whether any comparison has been done between LBFGS++ and scipy's L-BFGS-B, or alternatively, how to adjust the settings of LBFGS++ so that its performance is 'like' that of scipy's implementation (a rough parameter mapping is sketched below). My results always indicate that scipy's version is better overall. I wonder what the difference is, as LBFGS++ is necessary for my project.
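
For reference, here is a rough, approximate mapping of scipy's L-BFGS-B defaults onto LBFGS++ settings (parameter names follow the LBFGS++ documentation, including the past/delta function-decrease test; the two implementations' convergence criteria are not identical, so results will still differ):

#include <LBFGSB.h>

// Approximate scipy.optimize L-BFGS-B defaults, expressed as LBFGS++ settings.
LBFGSpp::LBFGSBParam<double> scipy_like_params()
{
    LBFGSpp::LBFGSBParam<double> param;
    param.m = 10;                  // scipy's maxcor default
    param.epsilon = 1e-5;          // roughly scipy's pgtol (projected-gradient test)
    param.past = 1;                // enable the relative function-decrease test
    param.delta = 2.2e-9;          // roughly scipy's factr (1e7) * machine epsilon
    param.max_iterations = 15000;  // scipy's maxiter default
    return param;
}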

Thanks

Possible to estimate gradient numerically?

Hi Xuan, this is an amazing library; the functor-based setup is especially helpful, since most objective functions carry many other parameters besides the optimization argument.

Just a quick question: is it on the cards to augment this library with numerically estimated gradients? I know this issue was already closed because the person originally interested in it found analytical derivatives, but I really do have a problem that doesn't have analytical derivatives (only to first order, and they're rubbish). Thanks so much!
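
In the meantime, a gradient-free objective can be adapted with a finite-difference wrapper. A minimal sketch (not part of LBFGS++; central differences, O(h^2) accurate, costing 2n extra function evaluations per gradient):

#include <Eigen/Core>

// Wraps a scalar function f(x) so it matches the functor signature that
// LBFGSSolver::minimize() expects, filling `grad` by central differences.
template <class F>
class NumericalGradient
{
private:
    F f;
    double h;
public:
    NumericalGradient(F f_, double h_ = 1e-6) : f(f_), h(h_) {}
    double operator()(const Eigen::VectorXd& x, Eigen::VectorXd& grad)
    {
        Eigen::VectorXd xp = x, xm = x;
        for (int i = 0; i < x.size(); i++)
        {
            xp[i] += h; xm[i] -= h;
            grad[i] = (f(xp) - f(xm)) / (2 * h);
            xp[i] = x[i]; xm[i] = x[i];
        }
        return f(x);
    }
};

Any callable taking a const Eigen::VectorXd& and returning double could then be passed to the solver as NumericalGradient<MyFunc>(my_func).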

Several ways to crash LBFGS++

Hi,

first off, I want to say that this is just a great library and a joy to use. Clean header-only modern C++ with Eigen, what's not to like? :)
So, I've been using LBFGS++ in a private project and got to stress test it a bit. I've come across quite a few situations which I believe should be considered completely valid but which currently trigger a std::logic_error to be thrown. I'd like to report on these and suggest a few improvements.
Firstly, it's very easy to run into an ascending search direction by minimally modifying the examples. For instance, I've encountered one by modifying example-rosenbrock.cpp to have a non-zero starting-value vector startVals with startVals[n] = 1.0F and startVals[n + 1] = 6.0F. The culprit is in the update procedure. BFGS and friends assume that the Hessian matrix approximation is positive definite. This can only be maintained if y_k \cdot s_k is positive at each step (I follow Wikipedia and these lecture notes in notation, which appears to be standard). This is automatically true for convex functions with arbitrary line searches, or for arbitrary functions with line searches that enforce the (strong) Wolfe conditions.
example-rosenbrock.cpp features the non-convex Rosenbrock function and the default Armijo backtracking line search, which can easily cause y_k \cdot s_k < 0, which in turn can and frequently does cause a non-positive-definite (inverse) Hessian estimate, which in turn can and frequently does produce an ascending search direction.
For starters, I think the default line search should be changed from LineSearchBacktracking (which is also described as "Mainly for internal use" in its source file) to LineSearchNocedalWright from #6, which respects the Wolfe conditions. That should make the examples less prone to crashing after simple modifications. Btw, using LineSearchNocedalWright also appears to resolve #21.
One can, however, keep meaningfully optimizing after encountering y_k \cdot s_k < 0 even without enforcing the Wolfe conditions. The canonical trick, according to the very end of Chapter 8 in the previously referenced notes, is to simply restart the optimization from the current point, i.e., throw away all of your Hessian information (the y and s vectors) and start over. The first BFGS step is always just gradient descent, so this should avoid the ascending search direction. A sketch of this check is given below.
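
A minimal sketch of the restart idea (illustrative only, not LBFGS++'s actual internals; the threshold is an arbitrary choice):

#include <vector>
#include <Eigen/Core>

// Keep the L-BFGS correction pairs; on a curvature failure (y . s <= 0),
// discard the history so the next step falls back to gradient descent.
struct CorrectionHistory
{
    std::vector<Eigen::VectorXd> s_list, y_list;

    void add(const Eigen::VectorXd& s, const Eigen::VectorXd& y)
    {
        const double ys = y.dot(s);
        if (ys <= 1e-10 * s.norm() * y.norm())
        {
            // Positive definiteness of the Hessian approximation would be
            // lost: restart from the current point instead of updating.
            s_list.clear();
            y_list.clear();
            return;
        }
        s_list.push_back(s);
        y_list.push_back(y);
    }
};
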
I believe y_k \cdot s_k < 0 can occur with non-convex functions even when trying to enforce the Wolfe conditions when the problem has bounds - as a simple example, imagine a 1D concave function defined over a bounded interval.
For consistency, non-convex functions should either be prohibited when using line searches that don't enforce the Wolfe conditions or when there are bounds, or the restarting logic should be added to BFGSMat. The latter is much more attractive from a user's point of view, but it's probably not a one-liner :)
Even with all this in place, exceptional situations, which currently cause a std::logic_error to be thrown, can probably still occur due to rounding errors, functions that are not bounded from below, bugs in the functor implementations, etc. These situations should therefore continue being handled, though not necessarily by means of exception throwing. I agree with #17 that other mechanisms of reporting the manner of termination would be more user-friendly. But even if exceptions stay, I strongly feel that they should be a custom exception type derived from std::runtime_error rather than a std::logic_error. The latter exception type is intended to mark preventable programmer errors, which is not always the case in LBFGS++. Since exceptions propagate through the entire program, which might not even care that LBFGS++ is called somewhere down the stack, I'd argue it's best to stick to conventions.
It's also possible to get the box-constrained optimizer to crash on a lightly modified example. I managed to do so by modifying example-rosenbrock.cpp to use LBFGSB* rather than LBFGS* types, having a non-zero starting-value vector startVals with startVals[n] = 1.0F and startVals[n + 1] = 95.0F, and by bounding all of the variables to [0, \infty] (but you never know with floating point arithmetic - playing around with starting values might be necessary to get a crash on your system).
I've traced the crash to a big old typo - I'm pretty sure the fI_lo on this line should be I_lo.
I haven't yet stared at the Moré-Thuente paper long enough to be confident about these, but I also find two other details potentially troublesome. First, the second paragraph of section 4 mentions that the values and derivatives of \psi should be evaluated when generating trial step values until \psi(\alpha) \le 0 and \phi'(\alpha) \ge 0 is satisfied, after which \phi should be evaluated instead of \psi, and I don't see that switch in the code. And second, at the top of page 300 it is stated that the formula that is used for case 3 in step_selection before delta-scaling should only be used "If the cubic tends to infinity in the direction of the step and the minimum of the cubic is beyond \alpha_k" and that one should set \alpha_k = \alpha_s otherwise. I do not see any special handling of the cubic in case 3 in the code. But I might be misunderstanding some part of the code. What do you think?
To sum up, here are my suggestions ordered roughly by decreasing simplicity and urgency.

  • Fix the fI_lo typo.
  • Make LineSearchNocedalWright the default BFGS line search.
  • Add restarting logic to BFGSMat on encountering y_k \cdot s_k < 0.
  • Check the potential \phi/\psi and case-3 cubic issues of the Moré-Thuente search.
  • Replace exceptions with something more lightweight or at least change exception type.

Thanks again for putting this great library out there. I'm looking forward to seeing it become even greater with some of the corner cases above getting covered.

Examples hang on Linux/ppc64le

Trying to test LBFGS++ on Linux/ppc64le leads to the example programs in this repository hanging.

Test log
 + set -ex
 + for i in 'examples/*.cpp'
 + .../bin/powerpc64le-conda-linux-gnu-c++ [conda-forge toolchain flags elided] -I.../include/eigen3 -o examples/example-quadratic.cpp.out examples/example-quadratic.cpp
 + examples/example-quadratic.cpp.out
 2 iterations
 x = 
 0 1 2 3 4 5 6 7 8 9
 f(x) = 6.76695e-30
 + for i in 'examples/*.cpp'
 + .../bin/powerpc64le-conda-linux-gnu-c++ [conda-forge toolchain flags elided] -I.../include/eigen3 -o examples/example-rosenbrock-box.cpp.out examples/example-rosenbrock-box.cpp
 In file included from .../include/eigen3/Eigen/Core:350,
                  from examples/example-rosenbrock-box.cpp:1:
 .../Eigen/src/Core/arch/AltiVec/MatrixProduct.h: In function 'static void Eigen::internal::general_matrix_matrix_product<...>::run(...)':
 .../Eigen/src/Core/arch/AltiVec/MatrixProduct.h:2808:34: warning: builtin '__builtin_cpu_supports' needs GLIBC (2.23 and newer) that exports hardware capability bits
  2808 |       if (__builtin_cpu_supports ("arch_3_1") && __builtin_cpu_supports ("mma")){
       |           ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
 .../Eigen/src/Core/arch/AltiVec/MatrixProduct.h:2808:73: [same warning for the second __builtin_cpu_supports call]
 + examples/example-rosenbrock-box.cpp.out
 13 iterations
 x = 
       2       2 1.64743       2       2       2       2       2       2       2       2       2       2       2       2       2       2       2       2       2       2       2       2 2.10909       4
 f(x) = 360.284
 + for i in 'examples/*.cpp'
 + .../bin/powerpc64le-conda-linux-gnu-c++ [conda-forge toolchain flags elided] -I.../include/eigen3 -o examples/example-rosenbrock-bracketing.cpp.out examples/example-rosenbrock-bracketing.cpp
 + examples/example-rosenbrock-bracketing.cpp.out
 n = 2
 Test passed!
 
 n = 4
 Test passed!
 
 n = 6
 Test passed!
 
 n = 8
 Test passed!
 
 n = 10
 Test passed!
 
 n = 12
 Test passed!
 
 n = 14
 Test passed!
 
 n = 16
 Test passed!
 
 + for i in 'examples/*.cpp'
 + .../bin/powerpc64le-conda-linux-gnu-c++ [conda-forge toolchain flags elided] -I.../include/eigen3 -o examples/example-rosenbrock-comparison.cpp.out examples/example-rosenbrock-comparison.cpp
 + examples/example-rosenbrock-comparison.cpp.out
 n = 2
   Average #calls:
   LineSearchBacktracking : 34 calls, 24 iterations
   LineSearchBracketing   : 33 calls, 24 iterations
   LineSearchNocedalWright: 35 calls, 26 iterations
 n = 4
   Average #calls:
   LineSearchBacktracking : 56 calls, 43 iterations
   LineSearchBracketing   : 56 calls, 43 iterations
   LineSearchNocedalWright: 57 calls, 44 iterations
 n = 6
   Average #calls:
   LineSearchBacktracking : 109 calls, 89 iterations
   LineSearchBracketing   : 108 calls, 88 iterations
   LineSearchNocedalWright: 100 calls, 83 iterations
 n = 8
   Average #calls:
   LineSearchBacktracking : 130 calls, 109 iterations
   LineSearchBracketing   : 130 calls, 109 iterations
   LineSearchNocedalWright: 120 calls, 103 iterations
 n = 10
   Average #calls:
   LineSearchBacktracking : 136 calls, 114 iterations
   LineSearchBracketing   : 136 calls, 114 iterations
   LineSearchNocedalWright: 126 calls, 109 iterations
 n = 12
   Average #calls:
   LineSearchBacktracking : 140 calls, 118 iterations
   LineSearchBracketing   : 139 calls, 118 iterations
   LineSearchNocedalWright: 130 calls, 114 iterations
 n = 14
   Average #calls:
   LineSearchBacktracking : 144 calls, 122 iterations
   LineSearchBracketing   : 144 calls, 122 iterations
   LineSearchNocedalWright: 134 calls, 117 iterations
 n = 16
   Average #calls:
   LineSearchBacktracking : 146 calls, 124 iterations
   LineSearchBracketing   : 145 calls, 123 iterations
   LineSearchNocedalWright: 134 calls, 118 iterations
 n = 18
   Average #calls:
   LineSearchBacktracking : 148 calls, 126 iterations
   LineSearchBracketing   : 149 calls, 126 iterations
   LineSearchNocedalWright: 137 calls, 120 iterations
 n = 20
   Average #calls:
   LineSearchBacktracking : 150 calls, 127 iterations
   LineSearchBracketing   : 150 calls, 127 iterations
   LineSearchNocedalWright: 136 calls, 120 iterations
 n = 22
   Average #calls:
   LineSearchBacktracking : 150 calls, 128 iterations
   LineSearchBracketing   : 150 calls, 128 iterations
   LineSearchNocedalWright: 140 calls, 123 iterations
 n = 24
   Average #calls:
   LineSearchBacktracking : 152 calls, 129 iterations
   LineSearchBracketing   : 152 calls, 129 iterations
   LineSearchNocedalWright: 140 calls, 123 iterations
 + for i in 'examples/*.cpp'
 + .../bin/powerpc64le-conda-linux-gnu-c++ [conda-forge toolchain flags elided] -I.../include/eigen3 -o examples/example-rosenbrock.cpp.out examples/example-rosenbrock.cpp
 + examples/example-rosenbrock.cpp.out
 ##[error]The operation was canceled.
 ##[section]Finishing: Run docker build

Build log: https://app.travis-ci.com/github/conda-forge/lbfgspp-feedstock/jobs/575234358

The include file causes problems

In the file LBFGSpp/include/LBFGSpp/LineSearchMoreThuente.h, line 9,
#include "LBFGSpp/Param.h" makes compilation fail.

When I modify it to #include "Param.h", compilation is successful.

Numerical gradient estimation

Hi Yixuan. Do you think it would be possible to make a version where the gradient could be estimated numerically? I am trying to optimise the parameters of the covariance functions (kernels) in a Gaussian Process, and I am not sure if I can compute the gradient explicitly there. (Alternatively, if you have any pointers on how to estimate the derivative of the log-likelihood of the GP w.r.t. parameters of the kernels, I would really appreciate it.) Thanks!

Allow using a symbolic Hessian

Hello! It is an awesome library, thank you!

I have one question: do you plan to support an option to use a symbolic (i.e., not numerically approximated) Hessian? I have a small-scale problem that requires high accuracy, so a symbolic Hessian suits it better than a numerical one.

[PR Suggestion] Remove runtime exceptions

I was wondering whether you would be interested in a patch that removed runtime exceptions, opting instead to return a boolean, an enumeration, or some sort of std::optional value to report whether a minimization went well or not.

I am not proposing removing exceptions related to erroneous parameters (all the std::invalid_argument throws make sense to me).

I ask because exceptions can be computationally costly when used to report a failure status, and given that one has no way to know in advance whether the provided inputs will result in failure, throwing an exception seems unduly harsh on the caller. In my current usage, for example, I am simply wrapping minimize calls in a try/catch block that just logs failures.

Please let me know if you would be interested in this, and I can try to provide a PR :)
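
An illustrative sketch of one possible interface along these lines (hypothetical, not LBFGS++'s current API):

// Report the manner of termination through an enum instead of throwing,
// letting the caller decide what counts as fatal.
enum class MinimizeStatus
{
    Converged,
    MaxIterationsReached,
    LineSearchFailed,
    NonDescentDirection
};

struct MinimizeResult
{
    int niter;             // iterations performed
    MinimizeStatus status; // how the solver terminated
};

// A solver could then be used as:
//   MinimizeResult res = solver.minimize(fun, x, fx);
//   if (res.status != MinimizeStatus::Converged) { /* log and move on */ }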

Trouble converging

Hi, I'm trying to estimate the parameters of a GP with a standard RBF kernel. I'm having some trouble converging to the same parameters I get when implementing the code in Python. I am using the current master branch (not sure if I should be using a particular commit).

With LBFGSSolver the search seems to never end, going in loops. With LBFGSBSolver, after around ~20 iterations the search ends with a thrown exception. I've tried modifying the parameters to make it search longer, but it never really converges beyond what it has already achieved.

LBFGSBSolver gets somewhat close to the optimal solution found via Python (1.3455902517905933 0.7625631641671134 with the negative log likelihood at 5.8842014), but since it then throws, I don't want to simply use a try-catch when it isn't stopping cleanly (maybe it could actually reach the same solution as in Python).

I've double checked my gradients, and I hope I haven't made silly mistakes. Could you help me figure out if there's something I should be doing differently? Here is my code, if it helps:

#include <iostream>
#include <Eigen/Eigenvalues>

#include <LBFGSB.h>
#include <LBFGS.h>

constexpr double PI = 3.141592653589793238462643383279502884197169399375105820974944;

using Matrix2D = Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor | Eigen::AutoAlign>;
using Vector = Eigen::Matrix<double, Eigen::Dynamic, 1>;

Matrix2D makeDist(const Matrix2D & x1, const Matrix2D & x2) {
    return ((-2.0 * (x1 * x2.transpose())).colwise() + x1.rowwise().squaredNorm()).rowwise() + x2.rowwise().squaredNorm().transpose();
}

Matrix2D kernel(const Matrix2D & x1, const Matrix2D & x2, double l = 1.0, double sigma = 1.0) {
    auto dist = makeDist(x1, x2);
    return std::pow(sigma, 2) * ((-0.5 / std::pow(l, 2)) * dist).array().exp();
}

struct Likelihood {
    Likelihood(const Matrix2D & xData, const Vector & muData, double noise) :
            xData_(xData), muData_(muData), noise_(noise) {}

    const Matrix2D & xData_;
    const Vector & muData_;
    double noise_;

    double operator()(const Eigen::VectorXd& x, Eigen::VectorXd& grad) {
        const auto l = x[0];
        const auto sigma = x[1];

        auto k = kernel(xData_, xData_, l, sigma);
        k.diagonal().array() += std::pow(noise_, 2);

        auto kinv = k.inverse();

        // Compute derivatives as per
        // https://math.stackexchange.com/questions/1030534/gradients-of-marginal-likelihood-of-gaussian-process-with-squared-exponential-co
        Matrix2D dkl = k.array() * (makeDist(xData_, xData_) / std::pow(l, 3)).array();
        auto dks = 2 / sigma * k;

        grad[0] = 0.5 * muData_.transpose() * kinv * dkl * kinv * muData_ - 0.5 * (kinv * dkl).trace();
        grad[1] = 0.5 * muData_.transpose() * kinv * dks * kinv * muData_ - 0.5 * (kinv * dks).trace();
        // Minimize negative log-likelihood
        grad = -grad;

        // Compute log likelihood (correct)
        auto L = k.llt();
        Vector alpha = L.solve(muData_);

        const double logLikelihood = -0.5 * muData_.transpose() * alpha - L.matrixL().selfadjointView().diagonal().array().log().sum() - 0.5 * muData_.size() * std::log(2.0 * PI);

        // Print current parameters
        std::cout << l << ", " << sigma << " ==> " << -logLikelihood << "  -  " << grad.transpose() << std::endl;

        // Return negative log-likelihood
        return -logLikelihood;
    }
};

int main() {
    const double noise = 0.4;

    // Some test points to train on.
    Matrix2D X_train(6, 1);
    X_train << -5, -4, -3, -2, -1, 1;

    Vector Y_train = X_train.array().sin().array();

    // Optimizer params
    LBFGSpp::LBFGSBParam<double> param;
    param.epsilon = 1e-5;
    param.max_iterations = 100;

    param.m = 10;
    param.ftol = 0.4;

    LBFGSpp::LBFGSParam<double> paramx;
    // Set the fields of paramx (not param) for the unconstrained solver
    paramx.epsilon = 1e-5;
    paramx.max_iterations = 100;

    // Bounds
    Vector lb = Vector::Constant(2, 1e-8);
    Vector ub = Vector::Constant(2, std::numeric_limits<double>::infinity());
    // Initial guess
    Vector x = Vector::Constant(2, 1.0);

    LBFGSpp::LBFGSBSolver<double> solver(param);
    LBFGSpp::LBFGSSolver<double> solver2(paramx);

    // Log-likelihood function with set data/noise
    Likelihood like(X_train, Y_train, noise);

    double log;
    // solver.minimize(like, x, log, lb, ub);
    solver2.minimize(like, x, log);

    return 0;
}

The optimization only returns 6 digits of precision

#include <Eigen/Core>
#include <LBFGSB.h>

class ConstrainedLinearLeastSquare {
public:
    ConstrainedLinearLeastSquare(const Eigen::MatrixXd& A, const Eigen::VectorXd& b) : A(A), b(b) {}

    double operator()(const Eigen::VectorXd& x, Eigen::VectorXd& grad) {
        Eigen::VectorXd r = A * x - b;
        double value = 0.5 * r.squaredNorm();
        grad = A.transpose() * r;
        return value;
    }

private:
    const Eigen::MatrixXd& A;
    const Eigen::VectorXd& b;
};

Eigen::VectorXd lsqlin(const Eigen::MatrixXd& A, const Eigen::VectorXd& b, const Eigen::VectorXd& l_bound, const Eigen::VectorXd& u_bound) {
    double epsilon = 1e-12;
    int max_iterations = 200;

    LBFGSpp::LBFGSBParam<double> param;
    param.epsilon = epsilon;
    param.epsilon_rel = epsilon;
    param.max_iterations = max_iterations;

    LBFGSpp::LBFGSBSolver<double> solver(param);
    ConstrainedLinearLeastSquare fun(A, b);

    Eigen::VectorXd x = 0.5 * (l_bound + u_bound);

    // x will be overwritten to be the best point found
    double fx;
    int niter = solver.minimize(fun, x, fx, l_bound, u_bound);

    return x;
}

The variable x (the minimizer that is found) only shows 6 digits of precision. How can I increase it?
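
One likely explanation (an assumption about this report, not a confirmed diagnosis): C++ streams print 6 significant digits by default, so the solution may be more accurate than it looks when printed. A small sketch:

#include <iomanip>
#include <iostream>
#include <Eigen/Core>

int main()
{
    Eigen::VectorXd x = Eigen::VectorXd::Constant(3, 1.0 / 3.0);
    std::cout << x.transpose() << std::endl;  // 0.333333 ... (stream default: 6 digits)
    std::cout << std::setprecision(15)
              << x.transpose() << std::endl;  // 0.333333333333333 ...
    return 0;
}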

Compare with some optimization tools?

Have you compared LBFGS++ with other optimization tools on some simple functions? For example:

problem     other tool   compute time   tol
problem1    osqp         ?              ?
problem2    ipopt        ?
problem3    acado        ?

Would you like to share such comparison data?
Thanks
