
pyfm's People

Contributors

chezou, coreylynch, doomotw, thonic, tiagootto, tiagozortea

pyfm's Issues

Why are the descriptors weighted by area before projection?

This is the projection line:
return self.eigenvectors[:,:k].T @ (self.A @ func)
If I understand correctly, the basis vectors are orthogonal and are the solutions of L @ x = lambda * A @ x, where L holds the cotangent weights and A the area weights.
So isn't the multiplication by A redundant here, since our goal is just to project the descriptors onto the basis for dimensionality reduction?
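
One way to see why the A is needed (a sketch of my understanding, not the maintainer's answer): the eigenvectors of the generalized problem L @ x = lambda * A @ x are A-orthonormal (Phi.T @ A @ Phi = I) rather than orthonormal in the Euclidean sense, so the coefficients of a function f in that basis are Phi.T @ A @ f, not Phi.T @ f. A toy numpy/scipy check (all matrices here are made-up stand-ins):

    import numpy as np
    from scipy.linalg import eigh

    rng = np.random.default_rng(0)
    M = rng.standard_normal((5, 5))
    L = M @ M.T                             # symmetric "cotangent" (stiffness) matrix
    A = np.diag(rng.uniform(0.5, 2.0, 5))   # diagonal "area" (mass) matrix

    vals, Phi = eigh(L, A)                  # solves L @ x = lambda * A @ x

    print(np.allclose(Phi.T @ A @ Phi, np.eye(5)))   # True: A-orthonormal
    print(np.allclose(Phi.T @ Phi, np.eye(5)))       # False in general

    f = rng.standard_normal(5)
    coeffs = Phi.T @ (A @ f)                # projection as in the quoted line
    print(np.allclose(Phi @ coeffs, f))     # True: f is recovered exactly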

sklearn\cross_validation deprecation warning

sklearn 0.19.1

lib\sklearn\cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)

How to save the model

Since the computation of the model is time-consuming, is there any way to save the model for later prediction?
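
A minimal sketch using pickle (note that a later issue in this list reports anomalous predictions after reloading a pickled model, so verify the loaded model before relying on it):

    import pickle

    # fm is an already fitted pylibfm.FM instance
    with open("fm_model.pkl", "wb") as f:
        pickle.dump(fm, f)

    # later, in the prediction job
    with open("fm_model.pkl", "rb") as f:
        fm_loaded = pickle.load(f)
    preds = fm_loaded.predict(X_test)   # X_test: a scipy CSR matrix (hypothetical name)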

Can't install pyfm

Hi,
I am installing using pip install git+https://github.com/coreylynch/pyFM
It gave me the error below. Any ideas?
Error compiling Cython file:

...
                                       & validation_sample_weight)
            self._sgd_lambda_step(validation_x_data_ptr, validation_x_ind_ptr,
                                  validation_xnnz, validation_y)
        if self.verbose > 0:
            error_type = "MSE" if self.task == REGRESSION else "log loss"
            print "Training %s: %.5f" % (error_type, (self.sumloss / self.count))
                  ^
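
A likely cause (an assumption, not a confirmed diagnosis): the caret points at a Python 2 print statement, which recent Cython versions reject when building for Python 3. A Python 3 compatible version of the offending line would be:

    print("Training %s: %.5f" % (error_type, self.sumloss / self.count))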

Why is my training result (log loss) always 0?

Creating validation dataset of 0.01 of training for adaptive regularization
-- Epoch 1
Training MSE: nan
-- Epoch 2
Training MSE: nan
-- Epoch 3
Training MSE: nan
-- Epoch 4
Training MSE: nan
-- Epoch 5
Training MSE: nan
-- Epoch 6
Training MSE: nan
-- Epoch 7
Training MSE: nan
-- Epoch 8
Training MSE: nan
-- Epoch 9
Training MSE: nan
-- Epoch 10
Training MSE: nan

bug in classification prediction

It seems there is a bug in pyfm_fast.pyx in the prediction part for classification tasks:

In the _predict method, the outcome is calculated in line 252 using the predict_instance method. predict_instance evaluates the FM model and then scales the result with the sigmoid function for classification in _scale_prediction (line 179), which is fine and which is also needed for the training part.
The problem I see is that this sigmoid transformation is applied again in line 252, which means that _predict always returns values > 0.5 because we apply the sigmoid twice.
There is no effect for the regression task, due to the different handling of classification/regression within _scale_prediction.

What do you think?
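
A quick numeric check of the reported behaviour (pure numpy, no pyFM required): applying the sigmoid a second time maps every score above 0.5, which matches the observation that _predict would only return values > 0.5.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    raw = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])   # hypothetical FM raw scores
    once = sigmoid(raw)      # spans (0, 1)
    twice = sigmoid(once)    # everything lands in roughly (0.5, 0.73), i.e. always > 0.5
    print(once)
    print(twice)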

install error

I tried pip but I got this:

LINK : fatal error LNK1181: cannot open input file 'm.lib'
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\link.exe' failed with exit status 1181
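
A sketch of a workaround, assuming the extension in setup.py declares libraries=["m"] (the Unix math library, which has no MSVC counterpart and therefore cannot be found as m.lib on Windows):

    import sys
    from setuptools import Extension

    libs = [] if sys.platform == "win32" else ["m"]
    ext = Extension("pyfm_fast", ["pyfm_fast.pyx"], libraries=libs)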

Incorrect format

I am trying to use libFM on the Frappe dataset. However, I get the following error when running the code:

Original exception was:
Traceback (most recent call last):
File "fm.py", line 19, in
(train_data, y_train, train_users, train_items)=loadData("traindata.mat")
File "fm.py", line 11, in loadData
for line in f:
File "/usr/lib/python3.5/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xda in position 133: invalid continuation byte

Is there some problem with the input format of the training and/or test dataset?
My training and test datasets are in .mat format.
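
The traceback shows the .mat file being read line by line as UTF-8 text, but .mat is a binary format. A sketch using scipy instead (the variable name inside the file is hypothetical):

    from scipy.io import loadmat

    mat = loadmat("traindata.mat")   # parses the binary .mat container
    print(mat.keys())                # inspect the stored variable names
    train_data = mat["traindata"]    # replace with the actual variable name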

can't save and load model correctly

I tried to save a trained FM model and load it using pickle.
I could save and load it, but the loaded model predicted anomalous values (e.g. all zeros).
How do I save and load a trained model correctly?

Could you update the code to remove this: cross_validation.py:41: DeprecationWarning

In the near future your code may stop running because of cross_validation.py:41: DeprecationWarning.

This is the output of your test case:

from pyfm import pylibfm
from sklearn.feature_extraction import DictVectorizer
import numpy as np

train = [
    {"user": "1", "item": "5", "age": 19},
    {"user": "2", "item": "43", "age": 33},
    {"user": "3", "item": "20", "age": 55},
    {"user": "4", "item": "10", "age": 20},
]
v = DictVectorizer()
X = v.fit_transform(train)
print(X.toarray())
# [[ 19.  0.  0.  0.  1.  1.  0.  0.  0.]
#  [ 33.  0.  0.  1.  0.  0.  1.  0.  0.]
#  [ 55.  0.  1.  0.  0.  0.  0.  1.  0.]
#  [ 20.  1.  0.  0.  0.  0.  0.  0.  1.]]
y = np.repeat(1.0, X.shape[0])
fm = pylibfm.FM()
fm.fit(X, y)
fm.predict(v.transform({"user": "1", "item": "10", "age": 24}))

C:\Users\sndr\Anaconda3\lib\site-packages\sklearn\cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
[[19.  0.  0.  0.  1.  1.  0.  0.  0.]
 [33.  0.  0.  1.  0.  0.  1.  0.  0.]
 [55.  0.  1.  0.  0.  0.  0.  1.  0.]
 [20.  1.  0.  0.  0.  0.  0.  0.  1.]]
Creating validation dataset of 0.01 of training for adaptive regularization
-- Epoch 1
Training log loss: 0.13187

So this warning is triggered by the line

    from pyfm import pylibfm

and it is the sklearn cross_validation deprecation warning quoted above.

thanks

When I try to train a model I get

AttributeError: 'numpy.ndarray' object has no attribute 'indptr'

From
dataset = CSRDataset(X.data, X.indptr, X.indices, y_i, sample_weight)

The training X should be a numpy array, right?
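
CSRDataset reads the data/indptr/indices attributes of a scipy CSR matrix, so a dense numpy array will not work. A sketch of the conversion (X and y are the hypothetical training arrays):

    from scipy.sparse import csr_matrix

    X_sparse = csr_matrix(X)   # wrap the dense ndarray as CSR
    fm.fit(X_sparse, y)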

Different values for same item?

After training the model, I get different values for the same item when I add more items to the prediction list. Is this supposed to happen, or is it a bug?
For example:
print fm.predict(v.fit_transform({"user": "1", "item": "10", "age": 24}))

print fm.predict(v.fit_transform([{"user": "1", "item": "10", "age": 24},{"user": "1", "item": "12", "age": 24}]))

Both have user 1 and item 10, yet the predicted ratings differ...
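
One possible explanation (an assumption, not a confirmed bug): fit_transform refits the DictVectorizer on each call, so the column layout changes with whatever is passed in. Reusing the vectorizer fitted on the training data and calling transform keeps the encoding stable, as in this sketch:

    x1 = v.transform([{"user": "1", "item": "10", "age": 24}])
    x2 = v.transform([{"user": "1", "item": "10", "age": 24},
                      {"user": "1", "item": "12", "age": 24}])
    print(fm.predict(x1))
    print(fm.predict(x2)[:1])   # first row should now match the single-row prediction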

pyFM installation fails

Could you please share how to install it for Python 3 on a Windows computer?
I used the advice in
#11

and got this message:
Microsoft Windows [Version 10.0.14393]
(c) 2016 Microsoft Corporation. All rights reserved.
e:\factirizatoin machine including FFM\code\pyFM_master_May5>python setup.py build_clib
running build_clib
e:\factirizatoin machine including FFM\code\pyFM_master_May5>

However, when I run it from Python, this error happens:

File "E:\factirizatoin machine including FFM\code\pyFM_master_May5\myexampl.py", line 1, in
  from pyfm import pylibfm
File "E:\factirizatoin machine including FFM\code\pyFM_master_May5\pyfm\pylibfm.py", line 4, in
  from pyfm_fast import FM_fast, CSRDataset

builtins.ImportError: No module named 'pyfm_fast'

My friend tried to install this on Python 2 but he gets this error:

An exception has occurred, use %tb to see the full traceback.
SystemExit: usage: main.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: main.py --help [cmd1 cmd2 ...]
   or: main.py --help-commands
   or: main.py cmd --help
error: option -f not recognized

Adding user features reduces model performance?

Hi everyone,

I am trying to use the ml-1m data to build a recommender model for users. What is weird to me is that the model performs better without the user features. Did I do something wrong when adding the features, or is this normal?

Fitting the dataset
dataset = Dataset()
dataset.fit(users=(row['UserID'] for index, row in users_df.iterrows()),
            items=(row['MovieID'] for index, row in movie_df.iterrows()),
            user_features=set(user_features_flat))

Creating the interaction and feature matrix
(interactions, weights) = dataset.build_interactions(
    (row['UserID'], row['MovieID'], row['rating']) for index, row in ratings_df.iterrows())
user_feature_matrix = dataset.build_user_features(
    (row['UserID'], [row['Gender'], row['Occupation'], row['age_group']]) for index, row in users.iterrows())

Model with user features
model = LightFM(no_components=70, loss='warp')
model.fit(interactions, user_features=user_feature_matrix, item_features=None,
          sample_weight=None, epochs=70, num_threads=4)
p_k = evaluation.precision_at_k(model, test, k=10, user_features=user_feature_matrix,
                                item_features=None, preserve_rows=False, num_threads=4,
                                check_intersections=True).mean()
p_k  # 0.14658715

Model without user features
model_cf = LightFM(no_components=70, loss='warp')
model_cf.fit(interactions, user_features=None, item_features=None,
             sample_weight=None, epochs=70, num_threads=4)
p_k_cf = evaluation.precision_at_k(model_cf, test, k=10, user_features=None,
                                   item_features=None, preserve_rows=False, num_threads=4,
                                   check_intersections=True).mean()
p_k_cf  # 0.1638668

Does not converge when training?

You can see the RMSE change below, using the example from the README.
Why does it not converge, and how can I fix it?

--- git/pyFM ‹master* ?› » python example.py 
Creating validation dataset of 0.01 of training for adaptive regularization
-- Epoch 1
Training RMSE: 0.49676
-- Epoch 2
Training RMSE: 0.44940
-- Epoch 3
Training RMSE: 0.44133
-- Epoch 4
Training RMSE: 0.43757
-- Epoch 5
Training RMSE: 0.43599
-- Epoch 6
Training RMSE: 0.43494
-- Epoch 7
Training RMSE: 0.43381
-- Epoch 8
Training RMSE: 0.43375
-- Epoch 9
Training RMSE: 0.43324
-- Epoch 10
Training RMSE: 0.43272
-- Epoch 11
Training RMSE: 0.43310
-- Epoch 12
Training RMSE: 0.43255
-- Epoch 13
Training RMSE: 0.43229
-- Epoch 14
Training RMSE: 0.43235
-- Epoch 15
Training RMSE: 0.43214
-- Epoch 16
Training RMSE: 0.43237
-- Epoch 17
Training RMSE: 0.43242
-- Epoch 18
Training RMSE: 0.43247
-- Epoch 19
Training RMSE: 0.43308
-- Epoch 20
Training RMSE: 0.44136
-- Epoch 21
Training RMSE: 0.44681
-- Epoch 22
Training RMSE: 0.44714
-- Epoch 23
Training RMSE: nan

Fatal error during the installation

I am trying to install pyFM on my machine using Python 2.7. During the installation I receive the following error: LINK : fatal error LNK1181: cannot open input file 'm.lib'

What does this error mean?

How to save the trained model?

I can't find a way to save the model. Could someone help solve this? Thanks.
Right now I can run:
fm = pylibfm.FM()
fm.fit(X,y)
fm.predict(v.transform({"user": "1", "item": "10", "age": 24}))
but I can't find a way to save the model.

AttributeError: 'numpy.ndarray' object has no attribute 'indptr'

Creating validation dataset of 0.01 of training for adaptive regularization
Traceback (most recent call last):
File "suanfa.py", line 25, in
fm.fit(trainX,trainY)
File "/home/ks/anaconda3/lib/python3.5/site-packages/pyfm/pylibfm.py", line 181, in fit
X_train_dataset = _make_dataset(X_train, train_labels)
File "/home/ks/anaconda3/lib/python3.5/site-packages/pyfm/pylibfm.py", line 239, in _make_dataset
dataset = CSRDataset(X.data, X.indptr, X.indices, y_i, sample_weight)
AttributeError: 'numpy.ndarray' object has no attribute 'indptr'

FMs on sklearn's Boston data giving NaNs

This gives errors; am I missing something?

from scipy import sparse
from sklearn.datasets import load_boston
import pylibfm

# instantiate FM instance with 7 latent factors
fm = pylibfm.FM(num_factors=7, num_iter=6, verbose=True)

# load dataset
boston = load_boston()

# fit FM, making sure to wrap the ndarray as a sparse csr
fm.fit(sparse.csr_matrix(boston.data), boston.target)

Creating validation dataset of 0.01 of training for adaptive regularization
-- Epoch 1
Training log loss: nan
-- Epoch 2
Training log loss: nan
-- Epoch 3
Training log loss: nan
-- Epoch 4
Training log loss: nan
-- Epoch 5
Training log loss: nan
-- Epoch 6
Training log loss: nan

fm.v is also all nan.
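
Two things worth checking (assumptions, not a confirmed diagnosis): the log prints "log loss", which suggests the FM is running in its classification mode on a regression target, and the raw Boston features are unscaled, which can make the SGD updates blow up. A sketch that sets task="regression" (as in the README example) and scales the features first:

    from scipy import sparse
    from sklearn.datasets import load_boston
    from sklearn.preprocessing import MinMaxScaler
    import pylibfm

    boston = load_boston()
    X = MinMaxScaler().fit_transform(boston.data)   # scale features into [0, 1]

    fm = pylibfm.FM(num_factors=7, num_iter=6, verbose=True,
                    task="regression", initial_learning_rate=0.001)
    fm.fit(sparse.csr_matrix(X), boston.target)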

Pycharm error: Process finished with exit code -1073741819 (0xC0000005)

I was using pyFM to classify a data set. When only 50% of the data was used, the program ran normally. But when I tried to use all the data (about 300,000 rows), an error occurred in PyCharm at fm.fit(): Process finished with exit code -1073741819 (0xC0000005). I wonder if it ran out of memory?

indptr not found

Train_x and Test_x are both scipy sparse matrices. fm.fit() runs normally, but predict raises the error "indptr not found" when calling CSRDataset(). Why does this error not occur in fit()?

About regularization

Hi,

From the code I see:

    # Regularization Parameters (start with no regularization)
    self.reg_0 = 0.0
    self.reg_w = 0.0
    self.reg_v = np.repeat(0.0, num_factors)

However, I don't see where the regularization parameters are updated. Moreover, after fitting the model I used model.reg_v to output the regularization parameters, and it gave me an array of zeros. Does the model actually impose regularization on the model parameters? Thanks.

Could you remove these warnings: pyfm_fast.c(3174): warning C4018: '<': signed/unsigned mismatch

There are many warnings during installation, e.g.:
pyfm_fast.c(3174): warning C4018: '<': signed/unsigned mismatch

Microsoft Windows [Version 10.0.16299.371]
(c) 2017 Microsoft Corporation. All rights reserved.

E:\Factorisation machens\how to install\pyfm_installation\pyFM-master\pyFM-master>python setup.py build_ext --inplace
running build_ext
cythoning pyfm_fast.pyx to pyfm_fast.c
building 'pyfm_fast' extension
creating build
creating build\temp.win-amd64-3.6
creating build\temp.win-amd64-3.6\Release
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\sndr\Anaconda3\lib\site-packages\numpy\core\include -IC:\Users\sndr\Anaconda3\include -IC:\Users\sndr\Anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tcpyfm_fast.c /Fobuild\temp.win-amd64-3.6\Release\pyfm_fast.obj
pyfm_fast.c
c:\users\sndr\anaconda3\lib\site-packages\numpy\core\include\numpy\npy_1_7_deprecated_api.h(12) : Warning Msg: Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
pyfm_fast.c(3174): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(3214): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(3245): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(3843): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(3910): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(3941): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(4885): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(4953): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(4964): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(5505): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(5572): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(5610): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(5996): warning C4018: '<': signed/unsigned mismatch
pyfm_fast.c(14070): warning C4244: 'initializing': conversion from 'double' to 'float', possible loss of data
pyfm_fast.c(14076): warning C4244: 'initializing': conversion from 'double' to 'float', possible loss of data
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\sndr\Anaconda3\libs /LIBPATH:C:\Users\sndr\Anaconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\LIB\amd64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.10240.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\8.1\lib\winv6.3\um\x64" /EXPORT:PyInit_pyfm_fast build\temp.win-amd64-3.6\Release\pyfm_fast.obj "/OUT:E:\Factorisation machens\how to install\pyfm_installation\pyFM-master\pyFM-master\pyfm_fast.cp36-win_amd64.pyd" /IMPLIB:build\temp.win-amd64-3.6\Release\pyfm_fast.cp36-win_amd64.lib
pyfm_fast.obj : warning LNK4197: export 'PyInit_pyfm_fast' specified multiple times; using first specification
Creating library build\temp.win-amd64-3.6\Release\pyfm_fast.cp36-win_amd64.lib and object build\temp.win-amd64-3.6\Release\pyfm_fast.cp36-win_amd64.exp
Generating code
Finished generating code

E:\Factorisation machens\how to install\pyfm_installation\pyFM-master\pyFM-master>
