Comments (8)
Try different values for the step_size parameter. Unfortunately, SGD suffers from vanishing / exploding gradients if step_size is not chosen properly. I would recommend using the ALS solver, which is more stable and usually faster.
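The sensitivity to step_size can be illustrated without fastFM. The following is a minimal sketch in plain Python, assuming a toy one-feature least-squares problem; `sgd_final_loss` and the data are hypothetical, and this is not fastFM's solver. With large feature values, a step size that looks harmless makes the updates overshoot until the weight overflows, a smaller one converges, and a very small one converges only slowly.

```python
import math

def sgd_final_loss(step_size, n_epochs=50):
    """Run plain SGD on y = 2*x with squared loss; return the final mean squared error."""
    # Toy data with large feature values (hypothetical), which forces a small step size.
    xs = [100.0, 200.0, 300.0, 400.0]
    ys = [2.0 * x for x in xs]
    w = 0.0
    for _ in range(n_epochs):
        for x, y in zip(xs, ys):
            grad = 2.0 * (w * x - y) * x   # d/dw of (w*x - y)^2
            w -= step_size * grad
            if not math.isfinite(w):
                return float("inf")        # the weight overflowed: SGD diverged
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Sweep the step size over several orders of magnitude and compare the final loss.
for step_size in (1e-3, 1e-5, 1e-6, 1e-7):
    print(step_size, sgd_final_loss(step_size))
```

Sweeping step_size on a logarithmic grid like this, and keeping the largest value that still converges, is usually a reasonable starting point.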
from fastfm.
Thanks @ibayer.
A small step_size works. I had to shrink it to around 1e-7 for SGD to work properly. BTW, why do you think ALS is usually faster? It seems to me SGD is always faster.
> BTW, why do you think ALS is usually faster? It seems to me SGD is always faster.
In my experience, ALS needs less clock time to converge to the same quality as SGD, even though one SGD iteration is much faster than one ALS iteration.
> In my experience, ALS needs less clock time to converge to the same quality as SGD, even though one SGD iteration is much faster than one ALS iteration.
What counts as one ALS iteration? Is it when every model parameter (i.e., each coordinate as in general coordinate descent) gets updated once?
I guess the actual time may depend on the data size and the particular problem (e.g., is convergence intrinsically difficult). On my fairly small, sparse data (~90000 instances, 100 effective non-zero dimensions), SGD takes about 1/2 to 2/3 of the clock time needed by ALS to converge to comparable quality, as measured on some hold-out data. Of course, SGD needs a much bigger n_iter though.
> What counts as one ALS iteration? Is it when every model parameter (i.e., each coordinate as in general coordinate descent) gets updated once?
That's the definition I'm using.
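To make that definition concrete, here is a minimal sketch of one such iteration for plain least-squares linear regression. It is not fastFM's FM solver, which also has to update the bias and the pairwise factors; the names `cd_sweep`, `X`, `y`, and `w` are hypothetical. One call updates every coordinate exactly once, each to its closed-form optimum while the others are held fixed.

```python
def cd_sweep(X, y, w, l2_reg=0.0):
    """One coordinate-descent sweep: update every weight once, in place."""
    n_rows, n_cols = len(X), len(w)
    # Residuals r_i = y_i - <x_i, w> are cached and kept in sync after each update.
    resid = [y[i] - sum(X[i][j] * w[j] for j in range(n_cols)) for i in range(n_rows)]
    for j in range(n_cols):
        col = [X[i][j] for i in range(n_rows)]
        denom = sum(c * c for c in col) + l2_reg
        if denom == 0.0:
            continue  # feature is never active; leave its weight unchanged
        # Closed-form minimizer for w[j] with all other weights held fixed.
        new_wj = sum(c * (r + c * w[j]) for c, r in zip(col, resid)) / denom
        delta = new_wj - w[j]
        resid = [r - c * delta for c, r in zip(col, resid)]
        w[j] = new_wj
    return w

# Toy data (hypothetical) that is fit exactly by w = [1, 2].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
w = [0.0, 0.0]
for sweep in range(20):   # 20 iterations, each touching every parameter once
    cd_sweep(X, y, w)
print(w)
```

Note that every sweep passes over all rows for every coordinate, which is why a single iteration of this kind costs far more than a single SGD update.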
> I guess the actual time may depend on the data size and the particular problem (e.g., is convergence intrinsically difficult).
Absolutely. It's also easy to construct a data set where SGD converges faster: just oversample the original data heavily. This shouldn't affect SGD at all, but it will slow down ALS quite a bit.
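A rough way to see this, as a sketch with hypothetical helper names and a simplified cost model rather than a measurement of fastFM: one ALS sweep has to pass over every stored non-zero, so its cost grows with the number of duplicated rows, whereas one SGD update touches a single sampled row no matter how many copies exist.

```python
def als_sweep_cost(rows):
    """Non-zeros visited by one full coordinate sweep (every row is touched)."""
    return sum(len(row) for row in rows)

def sgd_update_cost(rows, index):
    """Non-zeros visited by one SGD update on a single sampled row."""
    return len(rows[index])

# Each row is a sparse instance stored as {feature_index: value} (toy data).
data = [{0: 1.0, 3: 1.0}, {1: 1.0, 2: 1.0}, {0: 1.0, 2: 1.0}]
oversampled = data * 10   # every instance repeated ten times

# Sweep cost scales with the oversampling factor; per-update SGD cost does not.
print(als_sweep_cost(data), als_sweep_cost(oversampled))
print(sgd_update_cost(data, 0), sgd_update_cost(oversampled, 0))
```

Since the duplicated rows carry no new information, SGD needs roughly the same number of updates to reach a given quality, while each ALS sweep becomes proportionally more expensive.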
> On my fairly small, sparse data (~90000 instances, 100 effective non-zero dimensions), SGD takes about 1/2 to 2/3 of the clock time needed by ALS to converge to comparable quality, as measured on some hold-out data. Of course, SGD needs a much bigger n_iter though.
I'm still surprised. The ALS solver is fairly optimized and carefully profiled; the SGD solver, not so much.
Are you using a public data set? Can you share the experiments?
I'm using data that is not public, and for various reasons I do not wish (and might not be allowed) to share it at the moment.
I just did some quick runs with the two approaches, without very thorough experiments, but I'm definitely willing to share them. When I have more time later, I'll run a few more experiments, organize the results a bit, and share them here.
That would be great.
I'm closing this one, as the original issue seems to be fixed after changing the SGD step size.