
libftrl-python's Introduction

FTRL-Proximal

This is an implementation of the FTRL-Proximal algorithm in C with Python bindings. FTRL-Proximal is an online learning algorithm that works well on sparse problems. The implementation follows the algorithm described in the paper "Ad Click Prediction: a View from the Trenches" (McMahan et al., 2013).
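For readers who have not seen the paper, the per-coordinate update can be sketched in pure Python as below. This is an illustration of the published algorithm for logistic loss, not the C code in this repository. The class name `FtrlProximalSketch` and its methods are hypothetical; only the parameter names `alpha`, `beta`, `l1` and `l2` mirror the constructor used in the example section. It assumes each feature index appears at most once per row and does not guard against overflow for extreme scores.

```python
import math


class FtrlProximalSketch:
    """Per-coordinate FTRL-Proximal for logistic loss (illustrative only)."""

    def __init__(self, n_features, alpha=1.0, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * n_features  # accumulated adjusted gradients
        self.n = [0.0] * n_features  # accumulated squared gradients

    def weight(self, i):
        # Closed-form weight. While |z_i| <= l1 the weight is exactly zero,
        # which is what produces sparse models.
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        step = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / step

    def predict_one(self, indices, values):
        # Only the non-zero features of the row are touched.
        score = sum(self.weight(i) * v for i, v in zip(indices, values))
        return 1.0 / (1.0 + math.exp(-score))

    def update_one(self, indices, values, y):
        p = self.predict_one(indices, values)
        for i, v in zip(indices, values):
            g = (p - y) * v  # log-loss gradient for coordinate i
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * self.weight(i)
            self.n[i] += g * g
        return p


# One positive example with features 0 and 1 set to 1.
model = FtrlProximalSketch(n_features=5, alpha=1.0, beta=1.0, l1=0.0, l2=0.0)
before = model.predict_one([0, 1], [1.0, 1.0])
model.update_one([0, 1], [1.0, 1.0], y=1)
after = model.predict_one([0, 1], [1.0, 1.0])

# With a large L1 penalty the same update leaves the weight at zero.
sparse = FtrlProximalSketch(n_features=5, l1=10.0)
sparse.update_one([0], [1.0], y=1)
```

Because the weights are derived from `z` and `n` on demand, only those two arrays need to be stored per feature.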

Some of the features:

  • Uses OpenMP to parallelize training, which makes it fast
  • The Python code can operate directly on scipy CSR matrices
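As background on the CSR format, the sketch below walks the three arrays that a `scipy.sparse.csr_matrix` exposes as `indptr`, `indices` and `data`. The arrays are written out by hand so the snippet runs without scipy, and the helper `iter_rows` is hypothetical. How this library hands those arrays to the C code is not shown here, so treat the loop as an illustration of the layout rather than of the bindings.

```python
# CSR arrays for the 2x5 matrix
#   [[1, 1, 0, 0, 0],
#    [0, 0, 1, 1, 1]]
indptr = [0, 2, 5]                 # row r occupies positions indptr[r]:indptr[r + 1]
indices = [0, 1, 2, 3, 4]          # column index of each stored value
data = [1.0, 1.0, 1.0, 1.0, 1.0]   # the stored non-zero values


def iter_rows(indptr, indices, data):
    """Yield (column_indices, values) for each row, skipping zeros entirely."""
    for r in range(len(indptr) - 1):
        start, end = indptr[r], indptr[r + 1]
        yield indices[start:end], data[start:end]


rows = list(iter_rows(indptr, indices, data))
```

Each row costs time proportional to its number of non-zero entries, which is why sparse input suits an algorithm that updates only active coordinates.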

Pre-requisites

Dependencies:

  • Requires numpy, scipy and OpenMP
  • Anaconda already ships with numpy and scipy
  • To install GOMP_4.0 for Anaconda, use conda install libgcc

Building

cmake . && make
mv libftrl.so ftrl/
python setup.py install

If you don't have cmake, it's easy to install:

mkdir cmake && cd cmake
wget https://cmake.org/files/v3.10/cmake-3.10.0-Linux-x86_64.sh
bash cmake-3.10.0-Linux-x86_64.sh --skip-license
export CMAKE_HOME=`pwd`
export PATH=$PATH:$CMAKE_HOME/bin

Example

import numpy as np
import scipy.sparse as sp

from sklearn.metrics import roc_auc_score

import ftrl

X = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]

X = sp.csr_matrix(X)
y = np.array([1, 1, 1, 0, 0], dtype='float32')

model = ftrl.FtrlProximal(alpha=1, beta=1, l1=10, l2=0)

# make 10 passes over the data
for i in range(10):
    model.fit(X, y)
    y_pred = model.predict(X)
    auc = roc_auc_score(y, y_pred)
    print('%02d: %.5f' % (i + 1, auc))

The same class can also be used for regression by passing model_type='regression':

from sklearn.metrics import mean_squared_error

y = np.array([1, 2, 3, 4, 5], dtype='float32')

model = ftrl.FtrlProximal(alpha=0.5, beta=1, l1=0, l2=0, model_type='regression')

# make 10 passes over the data
for i in range(10):
    model.fit(X, y)
    y_pred = model.predict(X)
    mse = mean_squared_error(y, y_pred)
    print('%02d: %.5f' % (i + 1, mse))

Use case

This library was used for the Criteo Ad Placement Challenge and showed very competitive performance. You can have a look at the solution here: https://github.com/alexeygrigorev/nips-ad-placement-challenge

In particular, it trained far faster than sklearn's LogisticRegression (a wrapper for LIBLINEAR) while reaching the same AUC:

  • sklearn: 1.2 hours to train, AUC = 0.734
  • libftrl-python: 2 minutes to train, AUC = 0.734

Contributors

alexeygrigorev
