Coder Social home page Coder Social logo

deepinterestnetwork's Introduction

DeepInterestNetwork

Deep Interest Network for Click-Through Rate Prediction

Important Statement

This code is a demo to implement DIN on Amazon data. Unfortunately, the code quality is realy poor and there are many problems with this version of the code. Thus, we did major code refactoring and publish the new version of code in DIEN and XDL. We strongly recommend you to use DIEN. Moreover, the wrong way to apply BatchNorm in DIN brings a faulty experimental result. We fix the bug and report the new results of DIN on Amazon(Electro):

Model GAUC
PNN 0.8679
deepFM 0.8683
DIN 0.8698
DIN with Dice 0.8711

The updated training log can be found in DIN

Introduction

This is an implementation of the paper Deep Interest Network for Click-Through Rate Prediction Guorui Zhou, Chengru Song, Xiaoqiang Zhu, Han Zhu, Ying Fan, Na Mou, Xiao Ma, Yanghui Yan, Xingya Dai, Junqi Jin, Han Li, Kun Gai

Thanks Jinze Bai and Chang Zhou.

Bibtex:

@article{Zhou2017Deep,
  title={Deep Interest Network for Click-Through Rate Prediction},
  author={Zhou, Guorui and Song, Chengru and Zhu, Xiaoqiang and Ma, Xiao and Yan, Yanghui and Dai, Xingya and Zhu, Han and Jin, Junqi and Li, Han and Gai, Kun},
  year={2017},
}

Requirements

  • Python >= 2.6.1
  • NumPy >= 1.12.1
  • Pandas >= 0.20.1
  • TensorFlow >= 1.4.0 (Probably earlier version should work too, though I didn't test it)
  • GPU with memory >= 10G

Download dataset and preprocess

  • Step 1: Download the amazon product dataset of electronics category, which has 498,196 products and 7,824,482 records, and extract it to raw_data/ folder.
mkdir raw_data/;
cd utils;
bash 0_download_raw.sh;
  • Step 2: Convert raw data to pandas dataframe, and remap categorical id.
python 1_convert_pd.py;
python 2_remap_id.py

Training and Evaluation

This implementation not only contains the DIN method, but also provides all the competitors' method, including Wide&Deep, PNN, DeepFM. The training procedures of all method is as follows:

  • Step 1: Choose a method and enter the folder.
cd din;

Alternatively, you could also run other competitors's methods directly by cd deepFM cd pnn cd wide_deep, and follow the same instructions below.

  • Step 2: Building the dataset adapted to current method.
python build_dataset.py

We put a processed data 'dataset.pkl' in DeepInterestNetwork/din. Considering the GitHub's file size limit of 100.00 MB, we split it into 3 file aa ab ac.

cat aa ab ac > dataset.pkl


  • Step 3: Start training and evaluating using default arguments in background mode.
python train.py >log.txt 2>&1 &
  • Step 4: Check training and evaluating progress.
tail -f log.txt
tensorboard --logdir=save_path

Dice

There is also an implementation of Dice in folder 'din', you can try dice following the code annotation in din/model.py or replacing model.py with model_dice.py

deepinterestnetwork's People

Contributors

zhougr1993 avatar alimama-machine-learning-platform avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.