
MCMC vs ALS (fastFM issue, open)

ibayer commented on May 28, 2024
MCMC vs ALS


Comments (6)

ibayer commented on May 28, 2024

Hyper-parameter tuning is more art than science; make sure that you also tune rank, n_iter and even
init_std (not all are equally important). A minimal sketch of these knobs follows the list below.

  1. The fact that MCMC has one l2_reg_V for each factor dimension (rank) does indeed give MCMC an intrinsic advantage.
    The code could be extended to give ALS the same number of l2_reg_V parameters, but that would lead to a hard-to-tune model. Theoretically MCMC has other advantages (read about Bayesian linear regression), but how much of a difference that makes depends on the data.
  2. Hard to say; the badMCMC performance should depend a lot on n_iter, and this is certainly not the intended way to use MCMC, but if it works...
  3. Have a look at our paper "Sample selection for MCMC-based recommender systems"; you can get it here without the paywall.
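
As a rough illustration of the knobs mentioned above, here is a minimal sketch assuming fastFM's scikit-learn-style estimators; X_train, X_test, y_train and all numeric settings are placeholders for this discussion, not values from the thread:

```python
# Illustrative sketch only: X_train / X_test are assumed to be scipy.sparse
# design matrices and y_train a NumPy target vector prepared elsewhere.
# (init_stdev is fastFM's name for what the thread calls init_std.)
from fastFM import als, mcmc

# ALS: a single l2_reg_V shared by all factor dimensions, tuned by hand
# together with l2_reg_w, rank, n_iter and init_stdev.
fm_als = als.FMRegression(n_iter=300, init_stdev=0.1, rank=8,
                          l2_reg_w=0.1, l2_reg_V=0.5)
fm_als.fit(X_train, y_train)
y_pred_als = fm_als.predict(X_test)

# MCMC: regularization is sampled within the chain, so mainly rank, n_iter
# and init_stdev remain to be tuned; fit_predict averages over the chain.
fm_mcmc = mcmc.FMRegression(n_iter=300, init_stdev=0.1, rank=8)
y_pred_mcmc = fm_mcmc.fit_predict(X_train, y_train, X_test)
```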


merrellb commented on May 28, 2024
  1. I am hoping to get a sense of how "bad" the badMCMC approach is. Depending mostly on n_iter still seems much easier to optimize than the multiple ALS parameters. However, if this is an intrinsically worse approach (loss of information, known weaknesses of the algorithm, etc.), I would expect the seemingly better performance to be more a reflection of (so far) insufficient ALS tuning.

  2. Thank you for the link. I look forward to reading the paper. However, the links from your university website all seem to lead to the paywall (perhaps you have an account/cookie that lets you bypass it)?

  3. My goal may be unusual: I am using RMSE to measure performance and tune the model, but I am actually most interested in V_ and the rank-sized column it holds for each feature. In the MovieLens context I can use these columns for user similarity and movie clustering, and in a word2vec context they can serve as word "embeddings". I would love to use MCMC for this (who doesn't want less tuning), but I am struggling to understand whether I need to switch over to ALS or SGD if all I care about is a single simple vector per feature.


ibayer commented on May 28, 2024
  1. Use whatever gives you the best result.
  2. link updated
  3. Go with ALS; MCMC doesn't really make sense for user/item embeddings (you would get a different embedding for each iteration). You can use MCMC for the hyper-parameter search if that helps in your case.
    Calling predict on an MCMC model gives you an "ALS" prediction using the hyper-parameters from the last MCMC chain. A sketch of extracting embeddings from a fitted ALS model follows below.
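
A hedged sketch of what pulling per-feature embeddings out of a fitted ALS model could look like; fm_als is assumed to be a fitted als.FMRegression and feature_idx a placeholder column index in the design matrix, neither of which comes from this thread:

```python
import numpy as np

# V_ holds the factor matrix with shape (rank, n_features):
# one rank-sized column per feature (user, movie, word, ...).
V = fm_als.V_
emb = V[:, feature_idx]          # embedding for a single feature

# Cosine similarity of this feature against all other features.
norms = np.linalg.norm(V, axis=0) * np.linalg.norm(emb)
sims = (V.T @ emb) / np.maximum(norms, 1e-12)
most_similar = np.argsort(-sims)[:10]   # indices of the 10 nearest features
```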


merrellb commented on May 28, 2024
  1. This seems a bit in conflict with the answer to Question 4, where ALS is suggested as superior in this situation.
  2. I've done a hard reload on your webpage and it still seems to be linking to the paywall.

  4a) So you are saying that predict on a well-tuned ALS model should represent the embeddings better than predict on MCMC, which must use the less meaningful parameters from the last MCMC chain?
  4b) Is there any sort of convergence going on where the parameters at the current end of the MCMC chain are approaching the parameters we might get from ALS (or at least moving towards stability)?

While I will certainly use whatever gives me the best results, I am new to the "art" of tuning these parameters. It isn't clear to me whether the superior MCMC results I see are because I need to invest more effort into tuning ALS, or whether MCMC can actually "beat" ALS even when used "improperly" (your response to Question 4 seems to suggest it shouldn't).

Thanks for all your help.


ibayer commented on May 28, 2024
  1. This seems a bit in conflict with the answer to Question 4 where ALS is suggested as superior in this situation.

No, I didn't say superior: "(you would get a different embedding for each iteration)".

  2. I've done a hard reload on your webpage and it still seems to be linking to the paywall.

Works for me; here is the link again: http://www.informatik.uni-konstanz.de/rendle/pub0/

4b) Is there any sort of convergence going on where the parameters at the current end of the mcmc chain are approaching the parameters we might get from ALS (or at least moving towards stability)?

No, it's not even clear how to define the end of an MCMC chain. :)


merrellb commented on May 28, 2024

Thanks, the paper link seems to work now (perhaps there was some caching earlier)

My apologies for my lack of familiarity with the terminology. Instead of "the end of an MCMC chain" I meant "the hyper-parameters from the last MCMC chain", which predict uses with MCMC. This approach seems to be strongly cautioned against in the docs: "This evaluation is fast but usually of low quality." I am therefore surprised to see results that seem to consistently beat ALS with defaults (and some basic regularization tuning). I am trying to understand whether this means I need to tune ALS more (since it seems it should be able to beat "low quality" predictions), or whether the warning is giving me the wrong impression of MCMC's limitations.
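
To make the two prediction modes concrete, here is a rough, illustrative sketch of the comparison being described; X_train, X_test, y_train, y_test and all settings are placeholders, not values or results from this thread:

```python
# Illustrative sketch only; the data splits and settings below are placeholders.
import numpy as np
from sklearn.metrics import mean_squared_error
from fastFM import als, mcmc

fm_mcmc = mcmc.FMRegression(n_iter=500, init_stdev=0.1, rank=8)
y_avg = fm_mcmc.fit_predict(X_train, y_train, X_test)  # averaged over the chain
y_last = fm_mcmc.predict(X_test)  # uses only the last draw
                                  # (the "fast but usually of low quality" path)

fm_als = als.FMRegression(n_iter=500, init_stdev=0.1, rank=8,
                          l2_reg_w=0.1, l2_reg_V=0.1)  # lightly tuned regularization
fm_als.fit(X_train, y_train)

for name, pred in [("MCMC averaged", y_avg),
                   ("MCMC last draw", y_last),
                   ("ALS", fm_als.predict(X_test))]:
    print(name, np.sqrt(mean_squared_error(y_test, pred)))
```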

