kiminh / parallelcdn

This project forked from bianan/parallelcdn

Parallelized Coordinate Descent Newton Method for Efficient L1-Regularized Minimization.

License: MIT License

Python 4.35% Makefile 0.61% C++ 92.83% C 2.21%

parallelcdn's Introduction

C/C++ implementation of PCDN, SCDN and CDN, as described in the paper:

Parallelized Coordinate Descent Newton Method for Efficient L1-Regularized Minimization. https://ieeexplore.ieee.org/abstract/document/8661743 https://arxiv.org/abs/1306.4080

Installation

============

On Unix systems, type

$ make

to build the `train' and `infer' programs. Type

$ make clean

to clean the built files.

Run either program without arguments to print its usage.

The software has been tested on Ubuntu 12.04 x86_64.

`train' Usage

=============

Usage: train [options] training_file test_file [model_file_name]

options:

-a algorithm : set algorithm type (default 0)
    0 -- CDN
    1 -- Shotgun CDN (SCDN)
    2 -- Parallel CDN (PCDN)

-s solver type : set type of solver (default 0)
    0 -- L1-regularized logistic regression with bias term
    1 -- L1-regularized L2-loss support vector classification

-c cost : set the parameter C (default 1)

-e epsilon : set tolerance of termination criterion
    |f^S(w)|_1 <= eps*min(pos,neg)/l*|f^S(w0)|_1,
    where f^S(w) is the minimum-norm subgradient at w

-g g -n n : generate the experimental results of CDN using the decreasing epsilon values eps/g^i, for i = 0,1,...,n-1 (default g=1.0 n=1)

-q : quiet mode (no screen outputs)

training_file: training set file

test_file: test set file

model_file_name: model file name. If you do not set model_file_name, it defaults to the result file name followed by ".model".
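The `-e` criterion and the `-g`/`-n` schedule above can be sketched in Python as follows. This is only an illustration, not the code in src/train.cpp. It assumes the objective has the form ||w||_1 + L(w), where `grad_loss` is the gradient of the (C-scaled) loss term L, and it assumes, as in LIBLINEAR, that `pos` and `neg` are the numbers of positive and negative training instances and `l` is their total. All function names are hypothetical.

```python
def min_norm_subgradient(w, grad_loss):
    """Minimum-norm subgradient of ||w||_1 + L(w), component by component."""
    result = []
    for w_j, g_j in zip(w, grad_loss):
        if w_j > 0:
            result.append(g_j + 1.0)
        elif w_j < 0:
            result.append(g_j - 1.0)
        elif g_j > 1.0:
            # At w_j = 0 the subdifferential is [g_j - 1, g_j + 1];
            # pick the element closest to zero.
            result.append(g_j - 1.0)
        elif g_j < -1.0:
            result.append(g_j + 1.0)
        else:
            result.append(0.0)
    return result


def should_stop(sub_w, sub_w0, eps, pos, neg):
    """Check |f^S(w)|_1 <= eps*min(pos,neg)/l*|f^S(w0)|_1 (l = pos + neg is assumed)."""
    l = pos + neg
    norm_w = sum(abs(v) for v in sub_w)
    norm_w0 = sum(abs(v) for v in sub_w0)
    return norm_w <= eps * min(pos, neg) / l * norm_w0


def epsilon_schedule(eps, g=1.0, n=1):
    """Epsilon values eps/g^i for i = 0, 1, ..., n-1, as described for -g and -n."""
    return [eps / g ** i for i in range(n)]
```

With the defaults g=1.0 and n=1, `epsilon_schedule(eps)` returns the single value `eps`, which matches a plain run with `-e` only.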

`infer' Usage

=============

Usage: infer test_file model_file output_file

test_file: test set file

model_file: model file name

output_file: output file name

Datasets Download

=================

Type

$ python ./gen_data.py

By default, the script downloads one data set (real-sim) from the LIBSVM Data page. To download more data sets, edit "data_dict" in 'gen_data.py' to list the data sets to generate. Each data set is split 80/20 into training and test sets, and the resulting *.train and *.test files are stored in the 'data' directory. Note that gen_data.py calls bunzip2, so it must be installed.
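The 80/20 split can be pictured with the short Python sketch below. It is not taken from gen_data.py: the function names are hypothetical, and whether the real script shuffles the instances before splitting is not stated here, so shuffling is left optional and off by default.

```python
import random


def split_lines(lines, train_fraction=0.8, seed=None):
    """Split a list of LIBSVM-format lines into (train, test) parts."""
    lines = list(lines)
    if seed is not None:
        # Optional shuffle; an assumption, not confirmed for gen_data.py.
        random.Random(seed).shuffle(lines)
    cut = int(len(lines) * train_fraction)
    return lines[:cut], lines[cut:]


def split_file(path, out_prefix, train_fraction=0.8, seed=None):
    """Write out_prefix.train and out_prefix.test from one data file."""
    with open(path) as f:
        lines = [ln for ln in f if ln.strip()]  # skip blank lines
    train, test = split_lines(lines, train_fraction, seed)
    with open(out_prefix + ".train", "w") as f:
        f.writelines(train)
    with open(out_prefix + ".test", "w") as f:
        f.writelines(test)
    return len(train), len(test)
```

For example, `split_file("real-sim", "data/real-sim")` would write data/real-sim.train and data/real-sim.test, assuming the uncompressed data file is named real-sim and the data directory exists.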

Set #bundle_size, #threads

==========================

Edit lines 121-123 of src/train.cpp:

    int g_pcdn_thread_num = 0; // #threads for pcdn; 0 (default) means num_procs - 1, otherwise set a positive integer
    int g_bundle_size = 1250; // bundle size for pcdn
    int g_scdn_thread_num = 8; // #threads for scdn

then type

$ make
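To make the two PCDN settings concrete, the Python sketch below shows how a thread count of 0 could resolve to num_procs - 1 and how a feature set could be divided into bundles of the configured size. It is an illustration under assumptions, not a transcription of src/train.cpp: the function names are hypothetical, the clamp to at least one thread is my own choice, and the random partition follows my reading of the paper rather than the source code.

```python
import os
import random


def resolve_pcdn_threads(g_pcdn_thread_num=0, num_procs=None):
    """Return the thread count: the setting itself if positive, else num_procs - 1."""
    if num_procs is None:
        num_procs = os.cpu_count() or 1
    if g_pcdn_thread_num > 0:
        return g_pcdn_thread_num
    # Clamp to 1 so a single-core machine still gets one thread (assumption).
    return max(1, num_procs - 1)


def make_bundles(num_features, bundle_size, seed=0):
    """Randomly partition feature indices into bundles of at most bundle_size."""
    indices = list(range(num_features))
    random.Random(seed).shuffle(indices)
    return [indices[i:i + bundle_size]
            for i in range(0, num_features, bundle_size)]
```

With a hypothetical 3000 features and the default bundle size of 1250, `make_bundles(3000, 1250)` yields three bundles of 1250, 1250 and 500 features.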

The Log Files

=============

Each run stores two log files in the 'log/' directory, with names that indicate the configuration of the specific experiment. For example,

'pcdn_threads_3_bundle_1250_s_0_c_4.0_eps_1e-3_real-sim'

'pcdn_threads_3_bundle_1250_s_0_c_4.0_eps_1e-3_real-sim_verbosity'

indicate: algorithm: pcdn, threads: 3, bundle size: 1250, solver: 0, C: 4.0, epsilon: 1e-3, dataset: real-sim.

The first log file stores the contents printed on the terminal. The second log file stores the outputs of each iteration, which can be used to generate the experimental results.
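When collecting results from many runs, it can help to recover the configuration from a log file name. The Python sketch below handles only the PCDN naming pattern shown in the example; how CDN and SCDN logs are named is not described here, so treating them the same way would be an assumption. The function name and dictionary keys are hypothetical.

```python
import re

# Matches names like
# pcdn_threads_3_bundle_1250_s_0_c_4.0_eps_1e-3_real-sim[_verbosity]
LOG_NAME = re.compile(
    r"^(?P<algorithm>[a-z]+)"
    r"_threads_(?P<threads>\d+)"
    r"_bundle_(?P<bundle>\d+)"
    r"_s_(?P<solver>\d+)"
    r"_c_(?P<C>[^_]+)"
    r"_eps_(?P<eps>[^_]+)"
    r"_(?P<dataset>.+?)"
    r"(?P<verbosity>_verbosity)?$"
)


def parse_log_name(name):
    """Return the experiment configuration encoded in a log file name, or None."""
    m = LOG_NAME.match(name)
    if m is None:
        return None
    return {
        "algorithm": m.group("algorithm"),
        "threads": int(m.group("threads")),
        "bundle_size": int(m.group("bundle")),
        "solver": int(m.group("solver")),
        "C": float(m.group("C")),
        "epsilon": float(m.group("eps")),
        "dataset": m.group("dataset"),
        "per_iteration": m.group("verbosity") is not None,
    }
```

The `per_iteration` flag distinguishes the `_verbosity` file, which holds the per-iteration outputs, from the terminal log.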

Example

========

real-sim.train and real-sim.test are provided as an example dataset on the project webpage:

real-sim

bundle size: 1250

L1-regularized logistic regression with bias term:

$ ./train -a 2 -s 0 -c 4.0 -e 1e-3 ./data/real-sim.train ./data/real-sim.test model_lrb

$ ./infer ./data/real-sim.test model_lrb out_lrb

L1-regularized L2-loss support vector classification:

$ ./train -a 2 -s 1 -c 1.0 -e 1e-3 ./data/real-sim.train ./data/real-sim.test model_svc

$ ./infer ./data/real-sim.test model_svc out_svc
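To script several such runs, the commands above can be assembled from Python. The sketch below only builds argument lists from the options documented in the usage sections and runs them with the standard `subprocess` module. The helper names are hypothetical, and the binaries are assumed to sit in the current directory after `make`.

```python
import subprocess


def train_command(train_file, test_file, model_file=None, algorithm=0,
                  solver=0, cost=1.0, eps=None, quiet=False, binary="./train"):
    """Build the argument list for `train` using the documented options."""
    cmd = [binary, "-a", str(algorithm), "-s", str(solver), "-c", str(cost)]
    if eps is not None:
        cmd += ["-e", str(eps)]
    if quiet:
        cmd.append("-q")
    cmd += [train_file, test_file]
    if model_file:
        cmd.append(model_file)
    return cmd


def infer_command(test_file, model_file, output_file, binary="./infer"):
    """Build the argument list for `infer`."""
    return [binary, test_file, model_file, output_file]


def run_pipeline(train_file, test_file, model_file, output_file, **options):
    """Train a model, then run inference; raises if either program fails."""
    subprocess.run(train_command(train_file, test_file, model_file, **options),
                   check=True)
    subprocess.run(infer_command(test_file, model_file, output_file),
                   check=True)
```

For instance, `run_pipeline("./data/real-sim.train", "./data/real-sim.test", "model_lrb", "out_lrb", algorithm=2, solver=0, cost=4.0, eps="1e-3")` corresponds to the logistic regression example above.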

Copyright:

Copyright (2019) Yatao (An) Bian (yataobian.com). Please cite the above paper if you use this code in your work.
