bert4rec_repro's People

Contributors

asash, chrisjune, mathslove

bert4rec_repro's Issues

Reproduce BERT4Rec's results on Beauty, and provide better SASRec results by improving the loss function.

@asash First, thank you for your great work on sequential recommendation.

I noticed that in your article the BERT4Rec results on the Beauty dataset could not be replicated. After investigating, I found that this may be due to inconsistent preprocessing of the dataset. I reran the experiment with the Beauty dataset as processed by S3Rec (which can replicate the SASRec results on Beauty), and with that version the BERT4Rec results can be reproduced.

Inspired by your work, I have been trying to improve SASRec. As you say in the paper, the main difference between BERT4Rec and SASRec lies in the training objective, that is, the loss function, so I tried to improve the loss function of SASRec. With the improved loss function, SASRec surpasses BERT4Rec on the ML-1M and Beauty datasets and achieves similar results on the ML-20M and Steam datasets.

I provide code (in a fork) to reproduce the results described above. I hope these findings are helpful to you.

Thank you again for your great work!

[GRU4Rec] GRU4Rec doesn't use GPU

I was experimenting with different models and found that GRU4Rec takes much longer to train than the other neural models. Checking the logs, I saw warnings saying that the GRU layers are not using the cuDNN-optimized GPU kernel:
WARNING:tensorflow:Layer gru will not use cuDNN kernel since it doesn't meet the cuDNN kernel criteria. It will use generic GPU kernel as fallback when running on GPU
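
For anyone hitting the same warning, here is a minimal sketch of the constructor-level conditions that the TensorFlow 2.x documentation lists for the fast cuDNN GRU kernel. The helper name `cudnn_blockers` and the `GRU_DEFAULTS` table are hypothetical and not part of this repository or of TensorFlow. The sketch only inspects arguments and does not import TensorFlow. I have not checked which of these settings the GRU4Rec implementation here actually uses.

```python
# Constructor arguments of tf.keras.layers.GRU that matter for the cuDNN
# kernel, with their TF 2.x default values. All defaults satisfy the criteria.
GRU_DEFAULTS = {
    "activation": "tanh",
    "recurrent_activation": "sigmoid",
    "recurrent_dropout": 0.0,
    "unroll": False,
    "use_bias": True,
    "reset_after": True,
}


def cudnn_blockers(**overrides):
    """Return a list of reasons why a GRU built with these arguments
    would fall back to the generic (slower) GPU kernel.

    Hypothetical helper: it only checks constructor arguments. Two further
    documented criteria cannot be checked here: masked inputs must be
    strictly right-padded, and eager execution must be enabled in the
    outermost context.
    """
    unknown = set(overrides) - set(GRU_DEFAULTS)
    if unknown:
        raise ValueError(f"unsupported arguments: {sorted(unknown)}")
    cfg = {**GRU_DEFAULTS, **overrides}

    problems = []
    if cfg["activation"] != "tanh":
        problems.append("activation must be 'tanh'")
    if cfg["recurrent_activation"] != "sigmoid":
        problems.append("recurrent_activation must be 'sigmoid'")
    if cfg["recurrent_dropout"] != 0:
        problems.append("recurrent_dropout must be 0")
    if cfg["unroll"]:
        problems.append("unroll must be False")
    if not cfg["use_bias"]:
        problems.append("use_bias must be True")
    if not cfg["reset_after"]:
        problems.append("reset_after must be True")
    return problems


if __name__ == "__main__":
    # Example: recurrent dropout alone is enough to lose the cuDNN kernel.
    for reason in cudnn_blockers(recurrent_dropout=0.2):
        print(reason)
```

If the layer's arguments pass all of these checks, the remaining suspects are the two runtime criteria mentioned in the docstring, in particular the padding direction of masked sequences.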

[BERT4Rec, ALBERT4Rec] Intermediate size set to default and head number does not fit

As far as I understand, the "ours" BERT4Rec and ALBERT4Rec experiments on ML-1M use the model code located at recommenders/dnn_sequential_recommender/models. The evaluation configs pass the intermediate_size parameter to the models' constructors, but it is not propagated to the HF model config and therefore stays at the default (3072 for BERT and 16384 for ALBERT). The configs also use the default number of heads, which is 2 for BERT4Rec (the same as stated in the replicability paper) but 16 for ALBERT4Rec.
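
To illustrate the pattern described above, here is a minimal sketch that uses a stand-in dataclass instead of the real Hugging Face config classes, so it runs without extra dependencies. `EncoderConfig`, `build_config_buggy` and `build_config` are hypothetical names, and the hidden size of 64 is an assumed value for illustration only. This is not the repository's code. The divisibility check mirrors the constraint that standard multi-head attention places on hidden size and head count; whether that is the mismatch meant in the title is my assumption.

```python
from dataclasses import dataclass


@dataclass
class EncoderConfig:
    """Hypothetical stand-in for a Hugging Face style model config."""
    hidden_size: int = 64          # assumed value, for illustration only
    num_attention_heads: int = 2   # default head count reported for BERT4Rec
    intermediate_size: int = 3072  # BERT default mentioned in the issue


def build_config_buggy(hidden_size, num_heads, intermediate_size):
    # The constructor accepts intermediate_size but never forwards it,
    # so the config silently keeps its default value.
    return EncoderConfig(hidden_size=hidden_size,
                         num_attention_heads=num_heads)


def build_config(hidden_size, num_heads, intermediate_size):
    # Multi-head attention splits the hidden vector evenly across heads,
    # so the hidden size has to be a multiple of the head count.
    if hidden_size % num_heads != 0:
        raise ValueError(
            f"hidden_size={hidden_size} is not divisible by "
            f"num_heads={num_heads}"
        )
    # Forward every argument explicitly so nothing falls back to a default.
    return EncoderConfig(hidden_size=hidden_size,
                         num_attention_heads=num_heads,
                         intermediate_size=intermediate_size)


if __name__ == "__main__":
    print(build_config_buggy(64, 2, 256).intermediate_size)  # default kept
    print(build_config(64, 2, 256).intermediate_size)        # value forwarded
```

A quick way to confirm the problem in a real run is to print the config attached to the built model and compare it with the values in the evaluation config.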
