
climin's People

Contributors

adria-p, akosiorek, bachard, bayerj, luk0r, msoelch, ogh, orangeboreal, osdf, rueckstiess, superbobry, surban, tirune, warmspringwinds, wiebke


climin's Issues

Signal handling fails for certain signals

When using the criterion climin.stops.OnSignal with signal.SIGUSR1, the module fails with the following message:

Exception TypeError: 'signal handler must be signal.SIG_IGN, signal.SIG_DFL, or a callable object' in <bound method OnSignal.__del__ of <climin.stops.stops.OnSignal object at 0x22b58fd0>> ignored

Add Hessian free optimizer

There are several public releases already which can serve as inspiration. I think the first one is mainly interesting because it is maximally simple compared to the others.

The signature should be

def __init__(self, f, fprime, f_Hp, ...)

which means that most of the complexity of HF (mainly the whole Gauss-Newton machinery) is abstracted away. Structural damping is also part of f_Hp.

We will need a specialized version of CG, though.
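As a starting point, the specialized CG only ever needs the Hessian through the f_Hp interface proposed above. A sketch of plain linear CG against that interface (my own hypothetical helper, not climin code; a real HF-CG would add truncation heuristics and preconditioning):

```python
import numpy as np

def cg(f_Hp, b, x0=None, maxiter=50, tol=1e-10):
    # Solve H x = b where H is only available through the
    # Hessian-vector product f_Hp(v) = H @ v.
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float)
    r = b - f_Hp(x)       # residual
    d = r.copy()          # search direction
    rs = float(r @ r)
    for _ in range(maxiter):
        Hd = f_Hp(d)
        alpha = rs / float(d @ Hd)
        x = x + alpha * d
        r = r - alpha * Hd
        rs_new = float(r @ r)
        if rs_new < tol:
            break
        d = r + (rs_new / rs) * d
        rs = rs_new
    return x
```
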

initial value used in rmsprop

@bayerj
Line 152 of rmsprop.py reads

        self.moving_mean_squared = 1

I think the initial value should be 0 instead of 1.
Is there any reason why 1 is better than 0?
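To illustrate what is at stake (a toy sketch of the update rule, not climin's exact code): with an initial moving mean of 1 the denominator starts near 1, so the first steps for a small gradient are tiny; with an initial value of 0 the first step has magnitude roughly step_rate / sqrt(1 - decay), independent of the gradient's scale.

```python
import numpy as np

def first_step(g, init, decay=0.9, step_rate=0.1, eps=1e-8):
    # One rmsprop-style update starting from a given moving mean.
    mms = decay * init + (1 - decay) * g ** 2
    return step_rate * g / np.sqrt(mms + eps)

g = 0.01                          # a small gradient
tiny = first_step(g, init=1.0)    # denominator ~ 0.95, step ~ 1e-3
large = first_step(g, init=0.0)   # step ~ step_rate / sqrt(1 - decay)
```
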

Error in tutorial

Hello.

At the following page: https://climin.readthedocs.org/en/latest/tutorial.html, there is an error.

The following code is found about halfway down:

import climin
opt = climin.GradientDescent(parameters, d_loss_wrt_pars, step_rate=0.1, momentum=.95, args=args)

The problem is that step_rate is not an option for climin.GradientDescent; it should be steprate instead.

Installing climin on windows x64, anaconda 2.5 and python 2.7

Let me point out that with Anaconda 2.5, Python 2.7 and Windows x64, after installing climin with pip, it is necessary to comment out this section in climin's __init__.py:

    if sys.platform == 'win32':
        basepath = imp.find_module('numpy')[1]
        ctypes.CDLL(os.path.join(basepath, 'core', 'libmmd.dll'))
        ctypes.CDLL(os.path.join(basepath, 'core', 'libifcoremd.dll'))

Then climin runs superbly.

Today 64-bit machines are probably the majority and 32-bit machines an exception.

Test reorganization

Currently, most of the optimizers have different tests. There should be a central module containing three functions to optimize:

  • a skewed quadratic function,
  • Rosenbrock function,
  • a machine learning model that uses data and args/kwargs.

The last one needs an implementation of the model and the creation of a simple data set for which finding the global minimum is easily verifiable. I am thinking of something like a mixture of two Gaussians and logistic regression.
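The first two functions could be sketched like this (hypothetical names for the proposed central test module; the machine learning model is omitted):

```python
import numpy as np

def skewed_quadratic(x):
    # Quadratic with very different curvature along the two axes.
    A = np.diag([1.0, 100.0])
    return 0.5 * x @ A @ x

def skewed_quadratic_grad(x):
    A = np.diag([1.0, 100.0])
    return A @ x

def rosenbrock(x):
    # Classic banana-shaped valley with global minimum at (1, 1).
    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

def rosenbrock_grad(x):
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
        200 * (x[1] - x[0] ** 2),
    ])
```
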

Add climin to PyPI?

Hi, I'd love to see this package on PyPI. Any plans to do so? It would help with installing the package and declaring it as a dependency.

Just to reserve the package name and to test if things work, I registered and uploaded the most recent version to PyPI. I will either remove the package from PyPI or move the PyPI package ownership to you, whatever you wish.

In order to use PyPI, you'd need to fix the version numbering to follow PEP 440: https://www.python.org/dev/peps/pep-0440/. Thus, you would need to change the version number in setup.py to something like 0.1a1, 0.1b4, 0.1rc2 etc.

Uploading to PyPI can be done with python setup.py sdist upload.

If you are interested in this and need any help, I'd be happy to help if I can.

Make OnSignal work on windows

There is an issue with FORTRAN libraries replacing signal handlers and not being able to recover them:

http://stackoverflow.com/questions/15457786/ctrl-c-crashes-python-after-importing-scipy-stats

The solution is to add the following to climin/__init__.py:

    if sys.platform == 'win32':
        basepath = imp.find_module('numpy')[1]
        ctypes.CDLL(os.path.join(basepath, 'core', 'libmmd.dll'))
        ctypes.CDLL(os.path.join(basepath, 'core', 'libifcoremd.dll'))

and then to extend OnSignal with

    import win32api
    win32api.SetConsoleCtrlHandler(self._console_ctrl_handler, 1)

in the Windows case.

After that, climin has to be imported before scipy by the user.

Stopping criteria and a convenience function to optimize

Several reasons might lead you to stop optimization:

  1. the gradient is 0,
  2. the change of the parameters is negligible,
  3. a finite amount of time has passed,
  4. a desired error has been reached,
  5. a finite amount of function/gradient evaluations has been done,
  6. a finite amount of iterations has been done.

It would be nice to have convenience functions for these. Most are easy, but e.g. 2. needs to keep track of previous values; thus, the stopping criterion is stateful. I have a feeling we will overshoot if we try to solve all of these.
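For case 2, a stateful criterion could look like this (hypothetical class, not climin's existing API): it remembers the previous parameter vector between calls.

```python
import numpy as np

class NegligibleChange:
    """Stops when the parameters barely move between checks."""

    def __init__(self, tol=1e-8):
        self.tol = tol
        self.prev = None

    def __call__(self, info):
        wrt = np.asarray(info['wrt'], dtype=float)
        if self.prev is None:
            self.prev = wrt.copy()
            return False
        stop = np.max(np.abs(wrt - self.prev)) < self.tol
        self.prev = wrt.copy()
        return stop
```
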

Support to complex numbers

Optimization schemes with complex numbers are widely used in physics, and recently, machine learning.

I strongly suggest adding support for complex numbers to optimization engines like RmsProp et al.

We just need a few lines of change and several tests.

E.g. climin/rmsprop.py, lines 165-167:

            self.moving_mean_squared = (
                self.decay * self.moving_mean_squared
                + (1 - self.decay) * gradient ** 2) 
            --> + (1 - self.decay) * np.abs(gradient) ** 2)

A single line of change would make it applicable to complex numbers.
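The reason the abs is needed: for a complex gradient, gradient ** 2 is itself complex and useless under the square root, while np.abs(gradient) ** 2 is the real, nonnegative squared magnitude the moving average requires.

```python
import numpy as np

g = np.array([1 + 2j])
squared = g ** 2            # array([-3.+4.j]) -- complex
magnitude = np.abs(g) ** 2  # array([5.])     -- real and nonnegative
```
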

The same is true for Adam and Adadelta.

On the other hand, GradientDescent already works well without any change.

Maybe a bit more effort is needed for Rprop; I have no clue yet how to make it compatible with complex numbers, due to the ill-defined sign function for complex numbers.

potential bug in adadelta

@bayerj
It seems there is a bug in adadelta.py when momentum is used.
The momentum correction can be applied to adadelta, rmsprop and other stochastic updates.
The potential bug is at line 110 of adadelta.py:

    def _iterate(self):
        for args, kwargs in self.args:
            step_m1 = self.step
            d = self.decay
            o = self.offset
            m = self.momentum
            step1 = step_m1 * m * self.step_rate
            self.wrt -= step1

            gradient = self.fprime(self.wrt, *args, **kwargs)

            self.gms = (d * self.gms) + (1 - d) * gradient ** 2
            step2 = sqrt(self.sms + o) / sqrt(self.gms + o) * gradient * self.step_rate
            self.wrt -= step2

            self.step = step1 + step2
            self.sms = (d * self.sms) + (1 - d) * self.step ** 2

            self.n_iter += 1

            yield {
                'n_iter': self.n_iter,
                'gradient': gradient,
                'args': args,
                'kwargs': kwargs,
            }

I think it should be step1 = step_m1 * m instead of step1 = step_m1 * m * self.step_rate.
Correct me if I am wrong.

Note that line 160 of rmsprop.py is

 step1 = step_m1 * self.momentum

which is correct.

Remove stop

I just realized that if we never calculate more than needed in the optimizer's loop (because it can be done from the outside), we don't actually need the stop functionality. Yields are rather fast (compared to model evaluations). This would make the code a lot simpler.

Any objections?

Make the wrt variable optionally be a pair containing a setter and a getter

Currently, the signature of optimizers only allows the following:

def __init__(self, wrt, f, ...):
    # ...

However, in some cases this can lead to problems with the GPU: e.g. Theano does not guarantee that changing shared variables (e.g. retrieved with borrow=True) will actually change the real thing in the background.

I therefore suggest adding the following behaviour:

def __init__(self, wrt, f, ...):
    if isinstance(wrt, tuple):
        self._get_wrt, self._set_wrt = wrt
    else:
        # the old methods from below work for the array
        self._wrt = wrt

def _get_wrt(self):
    return self._wrt

def _set_wrt(self, val):
    self._wrt[:] = val

The downside is that we would lose some in-place operations. But I am not too sure whether that is actually the case.
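A toy demonstration of the tuple interface (hypothetical class names, not climin's actual base class): the optimizer only ever talks to a getter and a setter, so a backend that cannot share memory can still be updated correctly, while a plain array keeps in-place semantics.

```python
import numpy as np

class ToyOptimizer:
    def __init__(self, wrt):
        if isinstance(wrt, tuple):
            # Caller supplies (getter, setter).
            self._get_wrt, self._set_wrt = wrt
        else:
            # Plain array: default accessors writing in place.
            self._wrt = wrt
            self._get_wrt = lambda: self._wrt
            self._set_wrt = self._assign

    def _assign(self, val):
        self._wrt[:] = val

    def step(self, gradient, step_rate=0.1):
        self._set_wrt(self._get_wrt() - step_rate * gradient)
```
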

Make info dictionary consistent in all optimizers

Currently, the info dictionary is not consistent across optimizers. Some return a lot of information, some barely any. Certain information, like the iteration count n_iter, should be returned by all optimizers. This becomes particularly important for the stopping condition mechanism. The stopping conditions (most likely?) work on the info dict and require certain keys, like wrt, loss, n_iter etc.

Could this even be done in the base class, so that optimizer-independent information is added to the dict there and the optimizer only adds specific information to it before yielding?
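One way to sketch that base-class idea (hypothetical names, just illustrating the split of responsibilities):

```python
class BaseOptimizer:
    """Fills in optimizer-independent info keys in one place."""

    def __init__(self, wrt):
        self.wrt = wrt
        self.n_iter = 0

    def _base_info(self, **specific):
        # Guaranteed keys first; subclasses add their specifics.
        info = {'n_iter': self.n_iter, 'wrt': self.wrt}
        info.update(specific)
        return info
```
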

Also, some optimizers use dict(...) syntax and others the {...} syntax to create the dict. Make this consistent!

Should the args iterator be divided into iter_args/iter_kwargs instead?

The construction of different arguments for each iteration of the optimizer is somewhat tedious. However, some optimizers (HF, KSD) take several argument iterators: e.g., KSD needs a different argument iterator for the gradient calculation, the subspace construction and the inner loop. If that resulted in 6 different arguments passed to the constructor, that'd be rather ugly.

Needs more thought.

Line searches need to cache function values and gradients

Currently, some line searches do not cache the function value at the last step. Callers thus have to evaluate it again if necessary (or if the user wants to inspect it), which is expensive in several cases, e.g. batch learning.

There should be a unified API on how to get the latest f and f' results.
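A minimal sketch of such a unified API (hypothetical class; a fixed-step search stands in for a real backtracking search, the point being only that val and grad are cached for the caller):

```python
import numpy as np

class CachedLineSearch:
    def __init__(self, f, fprime):
        self.f, self.fprime = f, fprime
        self.val = None   # latest function value, cached after search()
        self.grad = None  # latest gradient, cached after search()

    def search(self, x, direction, step=1.0):
        x_new = x + step * direction
        self.val = self.f(x_new)
        self.grad = self.fprime(x_new)
        return step
```
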

GD does not accept sequence for `step_rate`

Contrary to what the docstring says, gradient descent does not accept a sequence for the step_rate parameter.

I'm happy to submit a pull request for this, if there's still interest!
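What accepting a sequence would enable (illustrative only, not climin's current behaviour): a per-iteration schedule consumed alongside the optimizer's iterations, e.g. a decaying step rate.

```python
import itertools

# A decaying schedule as an infinite iterator of step rates; a
# sequence-aware optimizer would call next() on it once per iteration.
step_rates = (0.1 / (1 + i) for i in itertools.count())
first_three = [next(step_rates) for _ in range(3)]
```
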

Convergence/Divergence

So we had that issue yesterday: climin should never stop iterating on its own because it thinks it has converged. However, it might happen that it diverges, and that needs to be handled.

E.g. I am currently playing around with logistic regression and NCG which sometimes diverges because the direction becomes invalid (e.g. NaN).

What is the best behaviour here?

  • throw a Diverged exception?
  • stop iterating?
  • try to recover via heuristics?
  • just continue and let the user find out via inspection of info?
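The first option could be as simple as the following (hypothetical exception and helper names, just to make the proposal concrete):

```python
import numpy as np

class Diverged(Exception):
    """Raised when the optimizer produces an invalid search direction."""

def checked_direction(direction):
    # Reject directions containing NaN or inf before taking a step.
    if not np.all(np.isfinite(direction)):
        raise Diverged('search direction contains NaN or inf')
    return direction
```
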

rmsprop final steps

I'm slightly confused about the final steps described in the docs vs. the code below: should the Nesterov momentum step be applied before updating the parameters, i.e. self.wrt -= step1 + step2?

        step1 = step_m1 * self.momentum
        self.wrt -= step1
        gradient = self.fprime(self.wrt, *args, **kwargs)

        self.moving_mean_squared = (
            self.decay * self.moving_mean_squared
            + (1 - self.decay) * gradient ** 2)
        step2 = self.step_rate * gradient
        step2 /= sqrt(self.moving_mean_squared + 1e-8)
        self.wrt -= step2

        step = step1 + step2
