cjlin1 / liblinear

LIBLINEAR -- A Library for Large Linear Classification

Home Page: https://www.csie.ntu.edu.tw/~cjlin/liblinear/

License: BSD 3-Clause "New" or "Revised" License

C 39.23% C++ 40.93% Python 18.04% Makefile 1.28% MATLAB 0.52%

liblinear's Issues

How to improve accuracy?

I'm running liblinear on text classification using -s 0 and -s 6 in order to get probability estimates of a multiclass classification task.

I have read through the guides, websites, and documentation, and tried feature scaling, but the accuracy of each classifier is always low, e.g. 14%.

How do I further improve the accuracy of the classifier?

Any help would be most useful.

After I compile the multicore version, I received this problem

Here is the issue. How can I solve it?

Invalid MEX-file
'/home/kcho/Dropbox/ML/mii-stroke-deeplearn/013017_stroke_tissue_fate/liblinear-multicore/matlab/train.mexa64':
dlopen: cannot load any more object with static TLS.

Segmentation fault with liblinear 2.11

train in liblinear 2.11 crashes for me after applying the change in #36. The console output is below; observe the iteration counter. Valgrind detected an out-of-bounds read in linear.cpp:75-83 (axpy()) and adding assert( y->index > 0 ); as the first statement in the while loop aborts the program on execution.

christoph:/tmp/tmp.8mis3wq6Em$ ./liblinear-2.11/train -B 1 -e 0.0001 training.txt svm-model
...*
optimization finished, #iter = 37
Objective value = -0.161526
nSV = 19
..*
optimization finished, #iter = 27
Objective value = -0.084889
nSV = 12
...*.
optimization finished, #iter = 40
Objective value = -0.110262
nSV = 14
...*
optimization finished, #iter = 39
Objective value = -0.185151
nSV = 17
...*
optimization finished, #iter = 37
Objective value = -0.089334
nSV = 13
Segmentation fault (core dumped)
christoph:~$ g++ --version
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Having difficulty installing and using Python version of liblinear

Here's what I've done:

Downloaded the entire liblinear library.
In a terminal (on my mac), I cd to the downloaded folder.
Type make in the main folder, and then also in the /python folder. This generates a liblinear.so.3 file in the /python folder.
I then start an ipython session (2.7 running in a conda virtual env) in the /python folder.
Whether I type import liblinear or import liblinearutil, I get the following error: Exception: LIBLINEAR library not found.

If it helps: I need liblinear because I'm trying to use a different library that requires it.

Compilation error for multi-threaded version

I ran the "make" command directly in the unzipped directory and got the following error:

g++ -Wall -Wconversion -O3 -fPIC -fopenmp -c -o newton.o newton.cpp
g++ -Wall -Wconversion -O3 -fPIC -fopenmp -c -o linear.o linear.cpp
linear.cpp: In member function 'virtual double l2r_erm_fun::fun(double*)':
linear.cpp:244:70: error: 'l2r_erm_fun::wTw' is not a variable in clause 'reduction'
#pragma omp parallel for private(i) reduction(+:wTw) schedule(static)
^
make: *** [linear.o] Error 1

The version I use is liblinear-multicore-2.43-2 from the official website.
How can I solve it? Thank you.

The starter value of C in best C search for liblinear

Dear Professor:

Recently I noticed that in your auto C search code, you set the min_C with:

1.0 / (matrix.length * max([Σ (line's value) ^ 2 for line in matrix]))

Could you explain why the starting C is set to this value? Is there a mathematical reason behind it? I didn't find any explanation in https://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf
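
As a rough illustration of the formula quoted in this issue, the following Python sketch computes the same quantity for a small dense matrix. The function name `starting_c` and the list-of-rows layout are assumptions made for this example; they are not taken from the LIBLINEAR source, which may apply further adjustments to the value.

```python
def starting_c(matrix):
    """Return 1 / (l * max_i ||x_i||^2) for a dense list-of-rows matrix.

    Hypothetical helper: it mirrors only the formula quoted in this issue.
    """
    if not matrix:
        raise ValueError("matrix must contain at least one row")
    # Largest squared Euclidean norm over all rows.
    max_sq_norm = max(sum(v * v for v in row) for row in matrix)
    if max_sq_norm == 0:
        raise ValueError("all rows are zero; the formula is undefined")
    return 1.0 / (len(matrix) * max_sq_norm)


rows = [[3.0, 4.0], [1.0, 0.0]]
print(starting_c(rows))  # 1 / (2 rows * largest squared norm 25)
```

The value shrinks as the number of rows or the largest row norm grows, so the search starts from a very small C on large or unscaled data sets.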

Regards

liblinear gives very different regression results compared with libsvm

Procedures

1. Using the simple regression dataset provided by libsvm http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#eunite2001.
2. Train & predict using both libsvm & liblinear

svm-train -s 3 -t 0 -c 1 -p 0.1 -e 0.001 -h 0 eunite2001 model.1 && svm-predict eunite2001.t model.1 prediction.1

liblinear-train -s 11 -c 1 -p 0.1 -e 0.001 eunite2001 model.11 && liblinear-predict eunite2001.t model.11 prediction.11

liblinear-train -s 12 -c 1 -p 0.1 -e 0.001 eunite2001 model.12 && liblinear-predict eunite2001.t model.12 prediction.12

liblinear-train -s 13 -c 1 -p 0.1 -e 0.001 eunite2001 model.13 && liblinear-predict eunite2001.t model.13 prediction.13
3. The results are here:

libsvm	liblinear -s 11	liblinear -s 12	liblinear -s 13
754.219	711.818	714.293	655.209
735.951	695.675	703.196	651.262
745.716	606.048	601.496	628.192
756.885	721.134	721.481	652.914
758.048	704.657	705.966	644.363
758.296	703.099	703.878	644.147
756.88	680.706	688.226	629.164
753.174	681.003	682.531	631.114
733.147	666.063	668.37	617.042
743.909	606.234	599.665	605.601
...	...	...	...

Questions

To my understanding, "liblinear-train -s 13" is the best match for "svm-train -s 3 -t 0". Is that correct?
Why are the results so different? In general, which tool gives better results?

error while importing liblinear in python

Hi,
I am getting the error below while trying to import liblinear in Python, and I am unable to figure out what is going wrong here.
AttributeError: /usr/lib/liblinear.so.1: undefined symbol: find_parameter_C

`train` fails to read input file

According to the LIBLINEAR website, liblinear and libsvm use the same input format. On my computer, train signals an error when it reads an input file that is successfully parsed by libsvm. I traced the different behavior to line 395 in train.c:

		inst_max_index = 0; // strtol gives 0 if wrong format

In libsvm, the same variable is initialized to -1, and changing the initial value of inst_max_index to -1 allows train to read the file successfully, too.
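
To illustrate why the initial value matters, here is a minimal Python sketch of an ascending-index check of the kind described in this issue. It is not the code from train.c: `parse_instance` and its `initial_max_index` parameter are hypothetical names, and the rule itself (each index must be strictly greater than the running maximum) is an assumption based on the comment quoted in the report.

```python
def parse_instance(line, initial_max_index):
    """Parse 'label idx:val idx:val ...' and enforce strictly ascending indices.

    Hypothetical sketch: initial_max_index plays the role of inst_max_index.
    """
    tokens = line.split()
    label = float(tokens[0])
    max_index = initial_max_index
    features = []
    for token in tokens[1:]:
        idx_text, val_text = token.split(":", 1)
        index = int(idx_text)
        # An index that does not exceed the running maximum is a format error.
        if index <= max_index:
            raise ValueError("wrong input format at index %d" % index)
        max_index = index
        features.append((index, float(val_text)))
    return label, features


line = "1 0:0.5 3:1.0"
print(parse_instance(line, initial_max_index=-1))  # index 0 is accepted
try:
    parse_instance(line, initial_max_index=0)
except ValueError as err:
    print("rejected:", err)  # index 0 is rejected when the start value is 0
```

Under this assumption, a file whose feature indices start at 0 would be accepted with a start value of -1 and rejected with a start value of 0.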

Error on 'make' under 'python/' for x86_64 unix

I'm getting

make -C .. lib
if [ "Darwin" = "Darwin" ]; then \
		SHARED_LIB_FLAG="-dynamiclib -Wl,-install_name,liblinear.so.3"; \
	else \
		SHARED_LIB_FLAG="-shared -Wl,-soname,liblinear.so.3"; \
	fi; \
	c++ ${SHARED_LIB_FLAG} linear.o tron.o blas/blas.a -o liblinear.so.3
ld: archive has no table of contents file 'blas/blas.a' for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [lib] Error 1
make: *** [lib] Error 2

when trying to run make under python/. Is there any solution for that? I am running macOS Mojave.

The output after training

Why is so much iteration information printed after training? According to the example, only one block should be output.
When I execute train(x, y), a lot of information appears on the console, as follows:

.......*
optimization finished, #iter = 74
Objective value = -61.281319
nSV = 103
.......**
optimization finished, #iter = 73
Objective value = -54.324909
nSV = 91
..*
optimization finished, #iter = 26
Objective value = -2.698234
nSV = 15
.*
optimization finished, #iter = 12
Objective value = -2.446329
nSV = 11

Infinite loop or never returns for logistic regression in nearly degenerate case using scikit learn

Description

When using scikit learn, Logistic Regression never returns on fitting with nearly degenerate data.
Scikit learn passed the blame on to liblinear.

Steps/Code to Reproduce

import sklearn.linear_model
import numpy as np
model = sklearn.linear_model.LogisticRegression()
num_pts = 15
x = np.zeros((num_pts*2, 2))
x[3] = 3.7491010398553741e-208
y = np.append(np.zeros(num_pts), np.ones(num_pts))
model.fit(x, y)

Expected Results

Return or throw error.

Actual Results

Never returns.

Versions

Linux-2.6.32-573.18.1.el6.x86_64-x86_64-with-redhat-6.7-Carbon
('Python', '2.7.12 |Anaconda 2.0.1 (64-bit)| (default, Jul 2 2016, 17:42:40) \n[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]')
('NumPy', '1.11.0')
('SciPy', '0.17.0')
('Scikit-Learn', '0.17.1')

Effect of normalization

When I use L1R_LR, the discrimination ratio changes depending on the normalization method.
I tried two normalization methods:

1. Centering (subtracting the average from each data point).
   Quick to implement, but the discrimination ratio is lower.
2. Normalizing the data to N(0,1).
   Longer to implement, but the discrimination ratio is higher than with 1.

I read the program, but I could not find the reason.
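
For reference, the two normalization methods mentioned in this issue can be sketched in plain Python as follows. This is only an illustration for a single feature column; the function names are assumptions, and the population standard deviation is used for the N(0,1) variant, which may differ from the poster's implementation.

```python
import statistics


def center(values):
    """Centering: subtract the mean from every value."""
    mean = statistics.mean(values)
    return [v - mean for v in values]


def standardize(values):
    """N(0,1) scaling: zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


column = [1.0, 2.0, 3.0, 6.0]
print(center(column))
print(standardize(column))
```

Centering leaves the spread of each feature unchanged, while the second method also rescales it, so a regularized model can weigh the features quite differently under the two schemes.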

Liblinear output

Hello,

I am using liblinear to do an SVM classification and I am seeing output during training that I don't understand. Specifically,

...
iter 259 act 2.666e-02 pre 2.666e-02 delta 1.443e-03 f 8.453e+03 |g| 3.771e+01 CG   2
cg reaches trust region boundary
iter 260 act 3.524e-02 pre 3.522e-02 delta 1.451e-03 f 8.453e+03 |g| 7.068e+01 CG   3
cg reaches trust region boundary
iter 261 act 2.766e-02 pre 3.918e-02 delta 1.143e-03 f 8.453e+03 |g| 5.879e+01 CG   3
cg reaches trust region boundary
iter 262 act 3.855e-02 pre 3.855e-02 delta 1.299e-03 f 8.453e+03 |g| 1.061e+02 CG   3
cg reaches trust region boundary
iter 263 act 2.558e-02 pre 2.558e-02 delta 1.316e-03 f 8.453e+03 |g| 4.086e+01 CG   2
cg reaches trust region boundary
iter 264 act 3.885e-02 pre 3.885e-02 delta 1.442e-03 f 8.453e+03 |g| 1.712e+02 CG   3
...

I haven't been able to find documentation anywhere for what these values mean. Is it bad that the "trust boundary" is reached? Does it mean training isn't working? In the first few iterations the trust boundary is not always reached but later in training it seems to be. Is there a resource anywhere that can help me understand?

Thanks!
Jordan

Invalid Mex-file error

I'm using Matlab 2015b. My compiler seems to be working properly, as I've compiled other .c files and the make command returns without errors. I don't remember experiencing these problems on my last computer setup, which used Matlab 2015a.

Thanks for your help!

Invalid MEX-file
'/home/mensen/matlab_toolboxes/liblinear-multicore-2.1-2/matlab/train.mexa64':
dlopen: cannot load any more object with static TLS

Error in mvpa_train>classif (line 39)
        model = train(Y, sparse(double(X)), ['-s '
        type ' -q -c ', num2str(best_lambda)]);

Where is the model file?

In the Ubuntu terminal, I typed:

./train -s 2 -v 5 -e 0.001 -q train1.txt model1

where "train1.txt" is my training sample file.

The result is:

Cross Validation Accuracy = 91.5398%

But I didn't find any file named "model1" in the current directory. What's the matter?

Can I use this library for the ranking task?

Hi, I'm a computer science student based in Milan.
I want to know if I can use this library (especially with the Python interface/wrapper) for the ranking task. I want to learn a ranking function in Learning to Rank style.
Is it possible?

Thanks for the answer!

LIBLINEAR library not found on windows!!!

I want to use liblinear with Python, but I get the error "LIBLINEAR library not found" when running "from liblinear import *". However, the path to liblinear.dll is correct, and the same code works on Linux. How can I solve it?

Error on 'make' under 'python/' for ubuntu

Tried to install in a VM running 16.04.1-Ubuntu, got a different error this time:

>>>sudo make
make -C .. lib
make[1]: Entering directory '/usr/local/liblinear-2.21'
g++ -Wall -Wconversion -O3 -fPIC -c -o linear.o linear.cpp
g++ -Wall -Wconversion -O3 -fPIC -c -o tron.o tron.cpp
make -C blas OPTFLAGS='-Wall -Wconversion -O3 -fPIC' CC='cc';
make[2]: Entering directory '/usr/local/liblinear-2.21/blas'
cc -Wall -Wconversion -O3 -fPIC -c dnrm2.c
cc -Wall -Wconversion -O3 -fPIC -c daxpy.c
cc -Wall -Wconversion -O3 -fPIC -c ddot.c
cc -Wall -Wconversion -O3 -fPIC -c dscal.c
ar rcv blas.a dnrm2.o daxpy.o ddot.o dscal.o
a - dnrm2.o
a - daxpy.o
a - ddot.o
a - dscal.o
ranlib blas.a
make[2]: Leaving directory '/usr/local/liblinear-2.21/blas'
if [ "Linux" = "Darwin" ]; then \
        SHARED_LIB_FLAG="-dynamiclib -Wl,-install_name,liblinear.so.3"; \
else \
        SHARED_LIB_FLAG="-shared -Wl,-soname,liblinear.so.3"; \
fi; \
g++ ${SHARED_LIB_FLAG} linear.o tron.o blas/blas.a -o liblinear.so.3
make[1]: Leaving directory '/usr/local/liblinear-2.21'
>>>

Any hints on how to solve this?

Thanks.

Any example of usage from C++ code?

Is there any example of usage from C++ code?

Problem running the example for testing LIBLINEAR with instance weight support

I recently installed the LIBLINEAR and was testing the example scripts that were part of the README.weight

The following code works until the problem() call:

from liblinear.liblinearutil import *
import csv
y, x = svm_read_problem('./heart_scale', return_scipy=True)
W = [1] * len(y)
W[0] = 10
prob = problem(W, y, x) # Error occurs here

The error code is shown here:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ksong4/miniconda3/lib/python3.7/site-packages/liblinear/liblinear.py", line 194, in __init__
    tmp_xi, tmp_idx = gen_feature_nodearray(xi)
  File "/home/ksong4/miniconda3/lib/python3.7/site-packages/liblinear/liblinear.py", line 106, in gen_feature_nodearray
    raise TypeError('xi should be a dictionary, list, tuple, 1-d numpy array, or tuple of (index, data)')
TypeError: xi should be a dictionary, list, tuple, 1-d numpy array, or tuple of (index, data)

The imported x and y are both lists using the provided example heart_scale file.

Pull requests not acted upon

Hi. Do you intend to act on the pull requests?

Training and Accuracy issue

I am a computer science student from India.
I am used to working with the liblinear SVM implementation through the sklearn library in Python.
Recently I started converting my code from Python to C++ and used LIBSVM's C_SVC, which works perfectly for me and gives above 97% accuracy.

But my data set is very large and training with LIBSVM is very slow, so I moved to LIBLINEAR to get multi-core performance for training. This has created a more serious problem: I am getting an accuracy of only around 15%.

DATASET:

2,50,000 Images of 7 different classes
dimension 128 X 128 px
calculate HOG features of all images, length of 1 feature vector is 1296
X* = 250000 x 1296
Y = 250000
whole data set is normalised in 0-1 range.

I am not using the command-line interface of LIBLINEAR because the training file becomes very large (several GB).
I include liblinear directly and have performed all the steps needed to use its classes and functions.

Now I have to classify all images into 7 different classes.

I am using param.s=2 and param.e=0.0001; I don't need to set weights for the different classes.
I perform cross fold validation 70 for 2,50,000 images to find the value of C.
It gives me a value of C of about 4.76837e-07 and CV accuracy = 16.3265%.

What should I do?
If I made any mistake, please point me to the correct path. Thank you.

How to get 64bit dll for python in windows by Makefile.win

I want to use the Python interface with 64-bit Python on Windows, but the liblinear.dll in the /windows directory seems to be a 32-bit one. I cannot generate a 64-bit DLL with "nmake -f Makefile.win clean all"; this only generates exe files in the /windows directory. How can I do that?

Classification (Multi-class) problem

Hi,

I have to classify my input data into multiple classes, and I am having trouble using liblinear to classify it and produce the desired output. Can you point me in the right direction, please?

input data:
! COUNT !! LABEL !! PATTERN !! FEATURES
! 0.1 !! 42 !! NOUN DE NOUN !! 0:millión 1:de 2:euro
! 0.8 !! 43 !! NOUN DE NOUN !! 0:millión 1:de 2:euro
! 0.1 !! 44 !! NOUN DE NOUN !! 0:millión 1:de 2:euro
! 0.6 !! 42 !! NOUN DE NOUN !! 0:umbral 1:de 2:pobreza
.....

The desired output should look like this, a list of weights for each feature/label:
0:umbral/42 = 1.0054
1:de/42 = 0.0
2:pobreza/42 = 1.014
0:umbral/43 = 1.0044
1:de/43 = 0.0
2:pobreza/43 = 1.004
....

thanks for your time to guide me.

Makefile should use CXXFLAGS when calling CXX

The Makefile passes $(CFLAGS) to $(CXX). It should pass $(CXXFLAGS) instead.

Furthermore, it defines its own CFLAGS, rather than honoring flags the user might supply when invoking make. This causes the build to fail if the user wants to supply flags that are required for the build to succeed.

This was discovered because liblinear is included in nmap. See the nmap bug report here: nmap/nmap#1161 and the MacPorts project's bug report here: https://trac.macports.org/ticket/53995

Cannot import even after make

I am trying to use liblinear for python. I use Ubuntu 20.04 and I ran make in the python folder.

I’m trying to import it by running
from liblinearutil import *
However, I am still unable to successfully import it outside the liblinear directory. How can I fix this? Thank you

Add probability estimates for SVC

LibSVM currently has probability estimates for linear SVC. Could this be implemented in Liblinear as well?

Error on make in MacOS

I am a Mac user. When I typed make, this error occurred:

clang: warning: no such sysroot directory: '/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk' [-Wmissing-sysroot]
libsvmread.c:1:10: fatal error: 'stdio.h' file not found
#include <stdio.h>
^~~~~~~~~
1 error generated.
warning: mkoctfile: building exited with failure status
Error: /opt/local/share/octave/6.2.0/m/miscellaneous/mex.m failed (line 54)
mex: building exited with failure status
=> Please check README for detailed instructions.

Does anyone know the reason why? I installed the linear file onto the readable directory, but this error occurred. I don't know what the problem is. Thanks.

python cost too much memory when reading

ModuleNotFoundError when using multiprocessing

I was trying to implement a custom classifier in scikit-learn using the liblinear library, and cross-validate using the cross_validate() method which can run multiple folds in parallel. It gives this error when running in parallel:

joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/akee511/src/emotion/.venv/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 407, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 116, in get
    return _ForkingPickler.loads(res)
  File "/home/akee511/src/emotion/src/ertk/sklearn/models/mtl.py", line 1, in <module>
    import liblinear.liblinearutil as liblinearutil
ModuleNotFoundError: No module named 'liblinear.liblinearutil'; 'liblinear' is not a package
"""

I have determined that it's due to adding the directory to sys.path in liblinearutil.py which seems to be unnecessary.

liblinear/python/liblinear/liblinearutil.py

Line 4 in 5b973b2

sys.path = [os.path.dirname(os.path.abspath(__file__))] + sys.path

If you remove that line the error does not occur while the code still works fine.
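
The failure mode described here can be reproduced in isolation. The sketch below is an illustration only: it assumes the mechanism is that a package's own directory on sys.path lets a same-named module file inside it shadow the package. The names `shadowdemo` and `util` are made up for the demonstration and have nothing to do with the liblinear package layout beyond mimicking it.

```python
import importlib
import os
import sys
import tempfile


def shadowing_demo():
    """Return the import error seen when a package's own directory leads sys.path."""
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "shadowdemo")
        os.mkdir(pkg)
        # Hypothetical layout: shadowdemo/__init__.py, shadowdemo/shadowdemo.py,
        # shadowdemo/util.py (a package that contains a module of the same name).
        for name in ("__init__.py", "shadowdemo.py", "util.py"):
            with open(os.path.join(pkg, name), "w") as fh:
                fh.write("# placeholder module for the demonstration\n")
        saved_path = list(sys.path)
        try:
            # The package directory comes first, then its parent directory.
            sys.path[:0] = [pkg, root]
            importlib.invalidate_caches()
            try:
                importlib.import_module("shadowdemo.util")
            except ModuleNotFoundError as exc:
                return str(exc)
            return None
        finally:
            # Undo every global side effect of the demonstration.
            sys.path[:] = saved_path
            for key in [k for k in sys.modules if k.split(".")[0] == "shadowdemo"]:
                del sys.modules[key]


print(shadowing_demo())
```

With the package directory first on the path, the name `shadowdemo` resolves to the plain module file rather than to the package, which is why the submodule import fails.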

Win7 python liblinear crash

When I run the code below:

y, x = [3, 3, 3], [{0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 1, 11: 1, 12: 1}, {0: 1, 1: 1, 2: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 10: 1, 11: 1, 13: 1, 14: 1, 15: 1}, {5: 1, 7: 1, 8: 1, 10: 1, 11: 1, 13: 1, 14: 1, 16: 1, 17: 1, 18: 1, 19: 1, 20: 1, 21: 1}]
prob = problem(y, x)
param = parameter('-c 4')
m = train(prob, param)

I got a python.exe crash (in PyCharm):

Traceback (most recent call last):
  File "D:/ProgramData/pycharm/workspace/Test/SlotFilling/main.py", line 95, in <module>
    model.fit()
  File "D:\ProgramData\pycharm\workspace\Test\SlotFilling\SVM.py", line 66, in fit
    m = train(prob, param)
  File "D:\ProgramData\Anaconda3\lib\site-packages\liblinearutil.py", line 155, in train
    m = liblinear.train(prob, param)
OSError: exception: access violation reading 0x00000C084B86D008

Why? Please help me. Thanks.

Python3 Support

Hello,

I was curious whether there will be Python 3 support, given that Python 2 will be EOL soon.

Is there a way to read out support vectors?

Hi, I just want to know how I can read out the support vectors (in the Python interface). Thanks.

Malware found in /windows/train.exe

Doing a routine scan of /windows/train.exe with Bitdefender revealed "Cloud.Malware.2158qCW@bmgxWG". My scans show that the malware is present since f0b2b38.

Use liblinear on windows

I installed liblinear successfully through pip install liblinear, but I get the error ModuleNotFoundError: No module named 'liblinear'.

I downloaded the zip file from GitHub and added the ../python directory to sys.path, but the import error remains.

How should I install it on Windows for Python? Thanks!

solve_l1r_l2_svc() uninitialized variable

Hi, thanks for a great tool. With regard to this compilation warning:

linear.cpp: In function `void train_one(const problem*, const parameter*, double*, double, double)':
linear.cpp:1365: warning: 'loss_old' might be used uninitialized in this function

Can I just initialize "loss_old" to 0? Thanks.

Version numbers not monotonically increasing

Josephs-MacBook-Pro:~ joe$ brew irb 
==> Interactive Homebrew Shell
Example commands available with: brew irb --examples
irb(main):001:0> Version.new("2.2") > Version.new("2.11")
=> false
irb(main):002:0>

This will cause a problem upgrading the Homebrew formula. Would it be possible to rename the release and tarball 2.20 instead of 2.2 or something along those lines?
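
The ordering problem can be shown with a few lines of Python. This sketch assumes that versions are compared component by component as integers, which is what the irb output above suggests; `parse_version` is a hypothetical helper, not Homebrew's implementation.

```python
def parse_version(text):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


# Compared component by component, 2.2 is older than 2.11 ...
print(parse_version("2.2") < parse_version("2.11"))
# ... whereas 2.20 sorts after 2.11, as the proposed renaming intends.
print(parse_version("2.20") > parse_version("2.11"))
# Read as decimal numbers, the order is the opposite.
print(float("2.2") > float("2.11"))
```

All three comparisons print True, which is the mismatch between the decimal reading and the component-wise reading of the release numbers.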

LibLinear giving low predictions for a feature that only appears in one class. Why?

I ran into an issue with liblinear where a feature present only in class A does not bias predictions toward that class. Isn't that weird?

If I increase the weight from 1 to 10 for all features, the prediction is better. Why?

More info here: https://stackoverflow.com/questions/60017644/is-there-a-reason-why-a-feature-only-present-in-a-given-class-is-not-being-predi

Adding support to build with cmake

It would be nice if the project also included cmake rules for the following reasons:

easier to build in Windows
easier to integrate in other projects that already use cmake (alternatively, providing pkg-config files)

Adding support for cmake does not require dropping the existing Makefiles.

There have been two PRs that tried this already. There was #42, which unfortunately mixed the addition of cmake with other undesired features and so was closed. And there is #26, which is now 5 years old and has not been acted on. I also made my own cmake rules (it's pretty simple).

It would be nice if this feature could be added. I can take a look at this, no problem, if there is interest in this feature. Please let me know.

Is synthetic bias feature added when using liblinear as a library?

I am using a java version of liblinear which was autogenerated from the C++ code, so I am a little conflicted about where to ask this question, so I figured why not ask in both projects? The documentation on bias is a little unclear to me. I can see that when using the provided command-line executables, an additional bias term is automatically added to the feature set, but from what I can tell this does not appear to be the case if you use liblinear as a library for training and/or testing and prediction.

Intuitively one would assume that setting the bias param to some value would have an effect on the model, but it seems to me that if you set bias to 1, this would do nothing. In the executables the bias term is used as the synthetic feature's value, but if you are required to manually add the synthetic feature when you write your own code, is there any point in even setting bias? One could set bias to 1.0 and add a synthetic feature with any value. Additionally, it appears that this behavior, coupled with the way in which the dot product is calculated makes it difficult to even know when there is an issue as the model will typically work regardless of the input size (as I recall, it seems that features are simply being searched for as needed, and due to the sparse dot product missing features are ignored).

Maybe I am completely wrong in my analysis of the code, I am very rusty with C++. But it seems like the library does not take ownership of managing the bias despite that behavior being heavily implied. When constructing a problem it is suggested to set the bias param, manually increase the number of features, and then also manually add the feature to the inputs. So it seems the data format for a problem and of a model allows for an inconsistent state--where you are requesting bias be used, but not actually getting it or using some value other than the bias term as the synthetic feature
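
The concern can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the library's code: `add_bias_feature` and `sparse_dot` are hypothetical helpers, rows are `{index: value}` dictionaries with 1-based indices, and a negative bias is taken to mean "no bias term".

```python
def add_bias_feature(rows, n_features, bias):
    """Append a synthetic feature (index n_features + 1, value bias) to each row.

    Hypothetical helper; returns the new rows and the new feature count.
    """
    if bias < 0:
        # Assumed convention: a negative bias disables the bias term.
        return [dict(row) for row in rows], n_features
    bias_index = n_features + 1
    augmented = []
    for row in rows:
        new_row = dict(row)
        new_row[bias_index] = bias
        augmented.append(new_row)
    return augmented, bias_index


def sparse_dot(weights, row):
    """Dot product that silently skips features missing from the row."""
    return sum(weights.get(index, 0.0) * value for index, value in row.items())


weights = {1: 0.5, 2: -1.0, 3: 2.0}  # index 3 holds the bias weight
rows, n = add_bias_feature([{1: 2.0, 2: 1.0}], n_features=2, bias=1.0)
print(sparse_dot(weights, rows[0]))           # bias weight contributes
print(sparse_dot(weights, {1: 2.0, 2: 1.0}))  # bias weight silently ignored
```

In this sketch, forgetting to append the synthetic feature raises no error; the decision value simply changes, which matches the hard-to-detect inconsistency described in this issue.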

bwaldvogel/liblinear-java#42

How to get intermediate alpha(a) when solving dual logistic regression

Hi there, I am wondering how I can get the intermediate alpha(a) when solving dual logistic regression, with L2 regularizer for example.

I am using Python interface but it seems the output model doesn't contain alpha information anymore.
Really appreciate any kind of help here.

coefficients larger than 1

I wonder whether coefficients can be larger than 1 even on normalized data.
I am benchmarking liblinear against glmnet, and I see that one of the coefficients is larger than one.

Thanks.

Merge of Liblinear with sample weights version

Hi, I would like to work on merging the sample weights version. Can you provide some guidance in terms of your own requirements?
My final goal is having a sample weights version in R (based on https://cran.r-project.org/web/packages/LiblineaR/index.html)
I have done the majority of the work,
but there are a few sticking points:

a) adding sample weights 'breaks' the interface for eg matlab and python versions
b) should one change the code to always use sample weights, or is the added computational cost too great?
c) currently I have merged the c++ code using conditional compilation because of b), however, this does not work for python and R (no conditional compilation), so it raises the worry of python/R code calling the library with the wrong compilation options.

eg I could

fatal error C1083: “math.h”: No such file or directory

When I compile liblinear in Windows 10 with VS2019, I get the following error:

D:\liblinear-2.30>nmake -f Makefile.win lib

    cl.exe /nologo /O2 /EHsc /I. /D _WIN64 /D _CRT_SECURE_NO_DEPRECATE -c tron.cpp

tron.cpp
tron.cpp(1): fatal error C1083: Cannot open include file: “math.h”: No such file or directory
NMAKE : fatal error U1077: “"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.26.28801\bin\HostX86\x86\cl.exe"”: return code “0x2”
Stop.

How to fix it?

Subproblem negative class weight is always set to 1 in OVR classification

It's possible to set both negative and positive class weights in a binary classification scenario. However, when a multi-class classification model is trained using a OVR solver, it's possible to set weight only for the positive (i.e. One) class, but the weight for the negative (i.e. Rest) class is always set to 1.

The difference can be seen in https://github.com/cjlin1/liblinear/blob/master/linear.cpp#L2552 where train_one uses both weights and https://github.com/cjlin1/liblinear/blob/master/linear.cpp#L2578 where param->C is used. That corresponds to always using 1 as a weight.

That doesn't allow class weight normalization, and unnormalized weights bias the C term.
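
The behaviour described in this issue can be summarized with a short Python sketch. It is not the code from linear.cpp: `ovr_costs` is a hypothetical helper that only restates the report, namely that the "one" class receives C times its weight while the "rest" side always receives plain C.

```python
def ovr_costs(c, class_weights, labels):
    """Return {label: (positive_cost, negative_cost)} for each OVR subproblem.

    Hypothetical sketch of the behaviour described in this issue.
    """
    costs = {}
    for label in labels:
        # Classes without an explicit weight default to 1.
        weight = class_weights.get(label, 1.0)
        costs[label] = (c * weight, c)
    return costs


print(ovr_costs(1.0, {1: 5.0, 2: 0.5}, labels=[1, 2, 3]))
```

Because the negative cost never changes, scaling all class weights by a common factor changes the balance between the two sides of every subproblem, which is the normalization concern raised in this issue.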

How to get 'coef_*SV(W)' value using liblinear package in python?

I need to get the value of coef (W) using liblinear.

from sklearn.svm import LinearSVC
x_train=x_train.reshape(500,784)
#y_train=np.argmax(y_train,axis=1)

from sklearn.svm import LinearSVC,SVC
clf_weights =LinearSVC(random_state=0, tol=.01)
clf_weights.fit(x_train,y_train)

LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
intercept_scaling=1, loss='squared_hinge', max_iter=1000,
multi_class='ovr', penalty='l2', random_state=0, tol=0.01, verbose=0)

[[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
...
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]]

print(clf.intercept_)
output:

[-0.1940297 0.03668576 -0.34181232 -0.27452097 -0.10010669 -0.27315419
-0.29847159 -0.19633 -0.90375486 -0.29134017]
sv=clf_weights.support_vectors_
print(clf_weights.support_vectors_)
coef=clf_weights.coef_

I got coefficients of shape (45,784) and support vectors of shape (336,784).

Also, how can I get coef (W)?

##########################################

is this correct?
for i in range(45):

w=w[i]+coef[i]*sv[i,:]

print(w)

I am getting output with all zeros:
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

Potential Integer Overflow vulnerability in linear.cpp

Hi,
It seems that there is a potential integer overflow. Please find the description below:

nr_feature can be an arbitrarily large number

liblinear/linear.cpp

Line 3539 in 60f1adf

FSCANF(fp,"%d",&nr_feature);

liblinear/linear.cpp

Line 3570 in 60f1adf

nr_feature=model_->nr_feature;
n is 1+nr_feature

liblinear/linear.cpp

Line 3572 in 60f1adf

n=nr_feature+1;
w_size is n

liblinear/linear.cpp

Line 3575 in 60f1adf

int w_size = n;
Call to malloc with the large integer can cause a memory allocation with an overflowed size

liblinear/linear.cpp

Line 24 in 60f1adf

#define Malloc(type,n) (type *)malloc((n)*sizeof(type))

liblinear/linear.cpp

Line 3582 in 60f1adf

model_->w=Malloc(double, w_size*nr_w);
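
The arithmetic behind this report can be checked with a small Python sketch. It simulates 32-bit two's-complement wraparound, which is the typical outcome on common platforms even though signed overflow is formally undefined in C and C++. The concrete values of `nr_feature` and `nr_w` are hypothetical and chosen only to make the effect visible.

```python
def wrap_int32(value):
    """Reduce an integer to the range of a 32-bit two's-complement int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


nr_feature = 2**30  # hypothetical value read from a crafted model file
nr_w = 4            # hypothetical number of weight vectors
w_size = wrap_int32(nr_feature + 1)
product = wrap_int32(w_size * nr_w)
print(w_size * nr_w)  # element count that was intended
print(product)        # element count after 32-bit wraparound
```

Under these assumptions the wrapped product is tiny compared with the intended count, so the buffer would be far smaller than the number of weights subsequently read into it.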

OnevsOne updated library?

Hello,

I am trying to implement OvO multiclass logistic regression classifier in MATLAB, but the version on the webpage for the OvO code is older than the current version of LIBLINEAR (2.20 vs 2.30). Would it be possible to obtain an updated version for this code? Thanks in advance.

how to handle multi-classification using one-vs-rest method?

I am a little confused about using this package for multi-class classification. Can anyone tell me how to do it? Thanks.

What I have tried:

train_labels=[[1,2], [2], [3]]
train_datas = [[1,1,0], [1,2,2], [1,1,1]]
prob = problem(train_labels, train_datas)
param = parameter('-s 0')
model = train(prob, param)

But it raises an error:
Traceback (most recent call last):
File "C:\Users\Jiaming\Dropbox\Internship in ADSC\DeepWalk\experiments\classifier.py", line 69, in process
prob = problem(train_labels, train_datas)
File "C:\Users\Jiaming\Anaconda2\lib\site-packages\liblinear-210-py2.7.egg\liblinear\liblinear.py", line 107, in init
for i, yi in enumerate(y): self.y[i] = y[i]
TypeError: a float is required
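
The traceback suggests that each entry of y must be a single number, so a list such as [1,2] is rejected. One possible workaround, sketched here as an assumption rather than as the library's recommended approach, is to split the multi-label data into one binary +1/-1 label vector per class and train a separate binary model for each. The helper name `binarize_labels` is hypothetical.

```python
def binarize_labels(label_sets):
    """Turn per-instance label lists into one +1/-1 label vector per class."""
    classes = sorted({label for labels in label_sets for label in labels})
    return {
        cls: [1 if cls in labels else -1 for labels in label_sets]
        for cls in classes
    }


train_labels = [[1, 2], [2], [3]]
for cls, y in binarize_labels(train_labels).items():
    # Each y could then be passed, together with the same feature rows,
    # to a separate binary training call.
    print(cls, y)
```

If every instance has exactly one label, no such conversion is needed and the labels can be given as a flat list of numbers.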
