Comments (15)
Actually, this happens during fit: fm.fit(X_train, y_train)
from fastfm.
My X_train is <91457x415 sparse matrix of type '<type 'numpy.float64'>' with 8773896 stored elements in Compressed Sparse Row format>
y_train is array([1, 1, 1, ..., 1, 1, 1])
y_train.shape is (91457,)
from fastfm.
Make sure that none of the columns contains only zero values.
edit: fix typo
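A minimal sketch of how that check could be done on a SciPy sparse matrix. The helper name find_empty_columns is made up for illustration and is not part of fastFM; it works on a copy, so explicitly stored zeros are dropped without touching the original matrix.

```python
import numpy as np
from scipy import sparse


def find_empty_columns(X):
    """Return the indices of columns that contain no non-zero value."""
    # Work on a CSC copy so the caller's matrix is not modified.
    X = sparse.csc_matrix(X, copy=True)
    # Drop explicitly stored zeros so they are not counted as entries.
    X.eliminate_zeros()
    # Number of stored (now strictly non-zero) entries per column.
    nonzero_per_col = X.getnnz(axis=0)
    return np.flatnonzero(nonzero_per_col == 0)


# Toy matrix: the middle column holds only zeros.
X = sparse.csr_matrix(np.array([[1.0, 0.0, 2.0],
                                [0.0, 0.0, 3.0]]))
empty = find_empty_columns(X)
print(empty)
```

An empty result means every column has at least one non-zero entry.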
from fastfm.
what is non?
from fastfm.
I don't see any column consisting just of zeros. If you'd like to replicate the problem, the data set is from
https://www.kaggle.com/c/bnp-paribas-cardif-claims-management/data
The code that gives me the trouble is this:
import pandas as pd
import numpy as np
from fastFM import als
from scipy import sparse
train = pd.read_csv("./input/train.csv")
target = train['target'].values
train=train.drop(['ID','target','v8','v22','v23','v25','v31','v36','v37','v46','v51','v53','v54','v63','v73','v75','v79','v81','v82','v89','v92','v95','v105','v107','v108','v109','v110','v116','v117','v118','v119','v123','v124','v128'],axis=1)
categoricalVariables = []
for var in train.columns:
    typ=str(train[var].dtype)
    if (typ=='object'):
        categoricalVariables.append(var)
train=train.drop(categoricalVariables,axis=1)
train=train.fillna(-1)
start=0
fin=100000
train=sparse.csc_matrix(np.array(train.iloc[start:fin,4:7]))
target[target<0.5]=-1
target=target[start:fin]
fm=als.FMClassification()
fm.fit(train, target)
from fastfm.
With 4:7 in train=sparse.csc_matrix(np.array(train.iloc[start:fin,4:7])), I get the error. With 5:7 the error disappears, which would point to column 4. However, 4:6 also works fine.
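Since the failure seems to depend on a combination of columns, it may help to compare their basic statistics side by side. The following is only a diagnostic sketch: column_summary is a hypothetical helper, the toy values are made up, and nothing here is known to be the cause of the error.

```python
import numpy as np


def column_summary(X):
    """Print basic statistics per column and return the correlation matrix."""
    X = np.asarray(X, dtype=float)
    for j in range(X.shape[1]):
        col = X[:, j]
        print(f"col {j}: min={col.min():g} max={col.max():g} "
              f"mean={col.mean():g} std={col.std():g} "
              f"finite={np.isfinite(col).all()}")
    # Pairwise correlation between columns (rowvar=False: columns are variables).
    return np.corrcoef(X, rowvar=False)


# Toy stand-in for the dense slice train.iloc[start:fin, 4:7]; values are made up.
X = np.array([[1.0, 2.0, -1.0],
              [2.0, 4.0, 0.5],
              [3.0, 6.0, 2.0]])
corr = column_summary(X)
```

Very large value ranges, non-finite entries, or near-perfect correlation between two columns would be things worth mentioning in the report.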
from fastfm.
Do you always get the same error? The assert tells us that a non-finite value is calculated for the global offset (zero-order) parameter.
https://github.com/ibayer/fastFM-core/blob/2ab4edbd403c4a5a7781cf861e1d8c3b2a87b3c5/src/ffm_als_mcmc.c#L172
This could be due to an all-zeros column or something else that's wrong with the input data.
Does it work with the mcmc sampler?
from fastfm.
It does work with mcmc but doesn't work with als. The error is always the same.
from fastfm.
Is the dataset very unbalanced (many more -1 than 1)? als.FMClassification uses probit regression, which is more sensitive to high class imbalance than logistic regression. Can you construct a minimal example (very few rows and columns)? You can export it with sklearn.datasets.dump_svmlight_file and post it here.
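A rough sketch of both steps, assuming the labels are already coded as -1/1. The toy X and y are made up and stand in for a small slice of the real data; the in-memory buffer is used only so that the round trip can be checked, and a file path can be passed instead to get a file to attach.

```python
import io

import numpy as np
from scipy import sparse
from sklearn.datasets import dump_svmlight_file, load_svmlight_file

# Toy data standing in for a few rows and columns of the real problem.
X = sparse.csr_matrix(np.array([[0.5, 0.0, 1.0],
                                [0.0, 2.0, 0.0],
                                [1.5, 0.0, 3.0],
                                [0.0, 0.1, 0.0]]))
y = np.array([1, 1, 1, -1])

# Class balance: how many samples carry each label.
labels, counts = np.unique(y, return_counts=True)
balance = dict(zip(labels.tolist(), counts.tolist()))
print(balance)

# Export in svmlight format (pass a path such as "minimal.svmlight"
# instead of the buffer to write a file).
buf = io.BytesIO()
dump_svmlight_file(X, y, buf)

# Read the export back to confirm it preserves the data.
buf.seek(0)
X_back, y_back = load_svmlight_file(buf, n_features=X.shape[1])
```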
from fastfm.
Also, just a comment. When I look at the reference page in the documentation, there is no description of the predict function anywhere (except for fit_predict in the mcmc case). Should it be added for completeness?
from fastfm.
No, mcmc should not be used with predict. See #40.
from fastfm.
What about als and sgd? The only function that is described in the reference is fit. There is no version of predict at all in the reference.
from fastfm.
You are right, that's indeed a bug in the sphinx files. Can you open a separate issue for that? The predict function is implemented in https://github.com/ibayer/fastFM/blob/master/fastFM/base.py. The derived classes inherit predict, but this doesn't show up in the reference.
from fastfm.
Here is a file with the issue. Data set dimensions are (50,20). If you take only the last 30 rows, there is no issue, but with all 50 rows there is. The target variable is unbalanced with about 70% of ones.
from fastfm.
I also ran into this problem, and there is no all-zero column in my data.
from fastfm.