jwetzl / cudalbfgs

This is a cross-platform, CUDA-based C++ library for general-purpose, unconstrained nonlinear optimization on the GPU. It implements the L-BFGS (“Limited-memory Broyden-Fletcher-Goldfarb-Shanno”) method, a popular quasi-Newton variant with a low memory footprint.

C 0.05% C++ 1.15% Cuda 1.71% Python 0.02% CMake 0.34% Objective-C 96.72%

cudalbfgs's Issues

Uninitialized state variable

Hi there,

I noticed that in your kernel function void strongWolfePhase1(bool second_iter) (in linesearch_gpu.h), the variable status is not initialized. So if the old value stored in device memory happens to be 2, bad things will happen.
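
The failure mode described here can be illustrated with a host-only C++ analogy. This is a sketch, not the library's code: the global `status`, the codes `kContinue` and `kStop`, and both functions are hypothetical stand-ins, and the report does not say what the value 2 actually means. The global plays the role of a variable in device memory, which keeps its value from one kernel launch to the next.

```cpp
#include <cassert>

// Hypothetical stand-in for a status flag in device memory: like device
// memory, this global keeps its value between calls.
static int status = 0;

// Hypothetical status codes (assumption: 2 is some "stop" condition).
constexpr int kContinue = 0;
constexpr int kStop = 2;

// Buggy pattern: status is written on one branch only, so a stale value
// left over from an earlier call survives when the branch is not taken.
void phase1Buggy(bool stop_condition) {
    if (stop_condition) {
        status = kStop;
    }
}

// Fixed pattern: reset the flag on entry, then set it where needed.
void phase1Fixed(bool stop_condition) {
    status = kContinue;
    if (stop_condition) {
        status = kStop;
    }
}
```

With a stale `status` of 2, `phase1Buggy(false)` leaves the 2 in place, while `phase1Fixed(false)` resets it. In the real kernel the equivalent fix would presumably be to assign `status` at the top of `strongWolfePhase1`, or to clear the device variable from the host before the launch.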

Results differ with different compute capability

I wrote a program with cudalbfgs and tested it on a 550 Ti and a GTX 760. The result on the former card looks normal, but the result on the GTX 760 is incorrect (most of the values are zero). Do I need to be aware of anything when using cards with different compute capabilities?

Need more details on how to build

So far, I have run 'cmake ..' and make successfully.

How do I do the following?

build a reference implementation on CPU with either float or double precision (requires Eigen),
build test cases,
enable error checking, verbose output and timing,
build example projects that demonstrate how the library is used (cf. /projects directory).

Line search failed

I tried the quadratic example with size_t n = 5000, and the output shows "Line search failed".

The script ignores Compute v2.1

There is a problem with cuda_compute_capability.c. It is caused by:

    if (major == 2 && minor == 1)
    {
        // There is no --arch compute_21 flag for nvcc, so force minor to 0
        minor = 0;
    }

See, the problem is that some Fermi cards do support Compute v2.1. In fact, Compute v2.1 exists (see https://en.wikipedia.org/wiki/CUDA#Supported_GPUs) but the way to activate that would be through setting the flags as -arch compute_20 -code sm_21.

The script currently assumes that, whatever compute_xx is, the sm target uses the same number and is set to sm_xx. I've run into problems with sm_20 on a machine that supports sm_21 before. For instance, I vaguely recall that numerical computations were more accurate with sm_21 than sm_20 (on the Caffe library if I recall). Considering the large number of CMake scripts out there that rely on this script, I hope the issue is fixed :)
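
One way to get the behavior requested above is to track the virtual architecture (compute_xx) and the real target (sm_xx) separately. The sketch below is not taken from cuda_compute_capability.c; `nvccArchFlags` is a hypothetical helper, and it assumes the detected major and minor versions are already available as integers.

```cpp
#include <string>

// Hypothetical helper (not part of the script): builds nvcc architecture
// flags from a detected compute capability.
std::string nvccArchFlags(int major, int minor) {
    int virtualMinor = minor;
    // nvcc has no compute_21 virtual architecture, so fall back to
    // compute_20 for the virtual architecture only ...
    if (major == 2 && minor == 1) {
        virtualMinor = 0;
    }
    // ... while the real minor version is kept for the sm_ target.
    return "-arch=compute_" + std::to_string(major) + std::to_string(virtualMinor) +
           " -code=sm_" + std::to_string(major) + std::to_string(minor);
}
```

For a Compute 2.1 card this yields `-arch=compute_20 -code=sm_21`, and for other cards the two numbers stay identical, for example `-arch=compute_30 -code=sm_30`.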

Unfortunately, my knowledge about CMake is rather limited, otherwise I would've fixed it and submitted a pull request.
