jwetzl / cudalbfgs

This is a cross-platform, CUDA-based C++ library for general-purpose, unconstrained nonlinear optimization on the GPU. It implements the L-BFGS (“Limited-memory Broyden-Fletcher-Goldfarb-Shanno”) method, a popular quasi-Newton variant with a low memory footprint.

C 0.05% C++ 1.15% Cuda 1.71% Python 0.02% CMake 0.34% Objective-C 96.72%

cudalbfgs's People

Contributors

jwetzl, mikewerth1

cudalbfgs's Issues

Uninitialized state variable

Hi there,

I noticed that in your kernel function void strongWolfePhase1(bool second_iter) (in linesearch_gpu.h), the variable status is not initialized. So if the old value left in device memory happens to be 2, bad things will happen.
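The hazard can be modeled on the host with a minimal sketch. Everything here is a hypothetical stand-in, not the library's code: g_status plays the role of a status word that persists in device memory between kernel launches, and phase1Buggy / phase1Fixed are illustrative names. The point is only that a value written on some branches must be reset on entry.

```cpp
// Host-side model of the hazard; all names are hypothetical stand-ins.
// g_status mimics a status word that lives in device memory and
// therefore keeps its value from one kernel launch to the next.
static int g_status = 0;

// Buggy pattern: status is written only when the condition fires,
// so a stale value survives whenever the branch is skipped.
int phase1Buggy(bool condition_met) {
    if (condition_met) {
        g_status = 1;
    }
    return g_status;
}

// Fixed pattern: reset status on entry so every call starts from
// a known value, regardless of what an earlier launch left behind.
int phase1Fixed(bool condition_met) {
    g_status = 0;
    if (condition_met) {
        g_status = 1;
    }
    return g_status;
}
```

With g_status left at 2 by an earlier call, phase1Buggy(false) still reports 2, while phase1Fixed(false) reports 0.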

Results differ between compute capabilities

I wrote a program with cudalbfgs and tested it on a 550 Ti and a GTX 760. The result on the former card looks normal, but the result on the GTX 760 is incorrect (most of the values are zero). Is there anything I need to be aware of when using cards with different compute capabilities?

Need more details on how to build

So far, I have run 'cmake ..' and make successfully.

How do I do the following?

  • build a reference implementation on CPU with
    either float or double precision (requires Eigen),
  • build test cases,
  • enable error checking, verbose output and timing,
  • build example projects that demonstrate how the
    library is used (cf. /projects directory).

Line search failed

I tried the quadratic example with size_t n = 5000, and the output is "Line search failed".
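For readers unfamiliar with the message, here is a minimal CPU sketch of how a line search can report failure. It assumes a plain Armijo backtracking rule, which is simpler than the strong Wolfe search the library's kernel names suggest, and it does not diagnose the report above. LineSearchResult, backtrackingLineSearch, and quadratic are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Outcome of one line search (hypothetical type for this sketch).
struct LineSearchResult {
    bool success;  // false corresponds to a "line search failed" report
    double step;   // accepted step length; not meaningful on failure
};

// Armijo backtracking from x along d, where g is the gradient at x.
// This is NOT the library's strong Wolfe search, only a simpler relative.
LineSearchResult backtrackingLineSearch(
    const std::function<double(const std::vector<double>&)>& f,
    const std::vector<double>& x,
    const std::vector<double>& g,
    const std::vector<double>& d,
    int max_trials = 30, double c1 = 1e-4, double shrink = 0.5)
{
    double slope = 0.0;  // directional derivative g^T d
    for (std::size_t i = 0; i < x.size(); ++i) slope += g[i] * d[i];
    if (slope >= 0.0) return {false, 0.0};  // d is not a descent direction

    const double f0 = f(x);
    std::vector<double> trial(x.size());
    double step = 1.0;
    for (int k = 0; k < max_trials; ++k) {
        for (std::size_t i = 0; i < x.size(); ++i) trial[i] = x[i] + step * d[i];
        // Sufficient-decrease (Armijo) condition.
        if (f(trial) <= f0 + c1 * step * slope) return {true, step};
        step *= shrink;
    }
    return {false, step};  // trial budget exhausted
}

// Hypothetical test objective: f(x) = 0.5 * sum_i x_i^2.
double quadratic(const std::vector<double>& x) {
    double s = 0.0;
    for (double v : x) s += v * v;
    return 0.5 * s;
}
```

In this sketch, failure has two sources: a search direction that is not a descent direction, or a trial budget that runs out before the sufficient-decrease condition holds.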

The script ignores Compute v2.1

There is a problem with cuda_compute_capability.c. It is caused by:

    if (major == 2 && minor == 1)
    {
        // There is no --arch compute_21 flag for nvcc, so force minor to 0
        minor = 0;
    }

The problem is that some Fermi cards do support Compute v2.1. Compute v2.1 exists (see https://en.wikipedia.org/wiki/CUDA#Supported_GPUs), but the way to target it is to set the flags as -arch compute_20 -code sm_21.

The script currently assumes that whatever compute_xx is, the sm value carries the same number and is set as sm_xx. I've run into problems with sm_20 on a machine that supports sm_21 before. For instance, I vaguely recall that numerical computations were more accurate with sm_21 than with sm_20 (with the Caffe library, if I recall correctly). Considering the large number of CMake scripts out there that rely on this script, I hope the issue gets fixed :)
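One possible shape for the fix, as a sketch rather than a patch to the script: keep the detected minor version for the sm_ value and special-case only the compute_ value for 2.1. nvccArchFlags is a hypothetical helper name, and the flag spelling follows the one quoted in this issue.

```cpp
#include <string>

// Hypothetical helper: build nvcc architecture flags from a device's
// compute capability. Compute 2.1 has no compute_21 virtual
// architecture, so it pairs compute_20 with sm_21.
std::string nvccArchFlags(int major, int minor) {
    const std::string sm = std::to_string(major) + std::to_string(minor);
    const std::string compute = (major == 2 && minor == 1) ? "20" : sm;
    return "-arch compute_" + compute + " -code sm_" + sm;
}
```

For a 2.1 device this yields -arch compute_20 -code sm_21; every other capability keeps matching numbers, for example -arch compute_30 -code sm_30.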

Unfortunately, my knowledge about CMake is rather limited, otherwise I would've fixed it and submitted a pull request.
