
This project forked from reczoo/fuxictr


A configurable, tunable, and reproducible library for CTR prediction, forked from https://github.com/huawei-noah/benchmark/tree/main/FuxiCTR

License: MIT License



FuxiCTR

This is a fork of the official release at https://github.com/huawei-noah/benchmark/tree/main/FuxiCTR.

Click-through rate (CTR) prediction is a critical task for many industrial applications such as online advertising, recommender systems, and sponsored search. FuxiCTR is an open-source library for CTR prediction that emphasizes configurability, tunability, and reproducibility. It also supports the development of Open-CTR-Benchmark, which makes open benchmarking for CTR prediction available.

Model List

| Publication | Model | Paper | Available |
|---|---|---|---|
| WWW'07 | LR | Predicting Clicks: Estimating the Click-Through Rate for New Ads | ✔️ |
| ICDM'10 | FM | Factorization Machines | ✔️ |
| CIKM'15 | CCPM | A Convolutional Click Prediction Model | ✔️ |
| RecSys'16 | FFM | Field-aware Factorization Machines for CTR Prediction | ✔️ |
| RecSys'16 | YoutubeDNN | Deep Neural Networks for YouTube Recommendations | ✔️ |
| DLRS'16 | Wide&Deep | Wide & Deep Learning for Recommender Systems | ✔️ |
| ECIR'16 | FNN | Deep Learning over Multi-field Categorical Data: A Case Study on User Response Prediction | ✔️ |
| ICDM'16 | IPNN | Product-based Neural Networks for User Response Prediction | ✔️ |
| KDD'16 | DeepCross | Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features | ✔️ |
| NIPS'16 | HOFM | Higher-Order Factorization Machines | ✔️ |
| IJCAI'17 | DeepFM | DeepFM: A Factorization-Machine based Neural Network for CTR Prediction | ✔️ |
| SIGIR'17 | NFM | Neural Factorization Machines for Sparse Predictive Analytics | ✔️ |
| IJCAI'17 | AFM | Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks | ✔️ |
| ADKDD'17 | DCN | Deep & Cross Network for Ad Click Predictions | ✔️ |
| WWW'18 | FwFM | Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising | ✔️ |
| KDD'18 | xDeepFM | xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems | ✔️ |
| KDD'18 | DIN | Deep Interest Network for Click-Through Rate Prediction | ✔️ |
| CIKM'19 | FiGNN | FiGNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction | ✔️ |
| CIKM'19 | AutoInt+ | AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks | ✔️ |
| RecSys'19 | FiBiNET | FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction | ✔️ |
| WWW'19 | FGCNN | Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction | ✔️ |
| AAAI'19 | HFM+ | Holographic Factorization Machines for Recommendation | ✔️ |
| Neural Networks'20 | ONN | Operation-aware Neural Networks for User Response Prediction | ✔️ |
| AAAI'20 | AFN+ | Adaptive Factorization Network: Learning Adaptive-Order Feature Interactions | ✔️ |
| AAAI'20 | LorentzFM | Learning Feature Interactions with Lorentzian Factorization | ✔️ |
| WSDM'20 | InterHAt | Interpretable Click-through Rate Prediction through Hierarchical Attention | ✔️ |
| DLP-KDD'20 | FLEN | FLEN: Leveraging Field for Scalable CTR Prediction | ✔️ |
| WWW'21 | FmFM | FM^2: Field-matrixed Factorization Machines for Recommender Systems | ✔️ |

Dependency

FuxiCTR requires the dependencies listed below. Although the implementation should work with other PyTorch versions, it is currently tested only on PyTorch 1.0.x-1.1.x.

  • python 3.6.x
  • pytorch 1.0.x-1.1.x
  • pandas
  • numpy
  • h5py
  • pyyaml
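
As a quick sanity check before running anything, the following sketch reports which of the packages listed above can be imported in the current environment. It is not part of FuxiCTR: the helper name `check_dependencies` is hypothetical, and it relies only on the standard library. Note that pytorch and pyyaml are imported under the names `torch` and `yaml`.

```python
import importlib.util
import sys

# Import names for the packages listed above (pytorch -> torch, pyyaml -> yaml).
REQUIRED_MODULES = ["torch", "pandas", "numpy", "h5py", "yaml"]


def check_dependencies(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    print("python", ".".join(map(str, sys.version_info[:3])))
    for name, found in check_dependencies().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

This only confirms that a package is installed; it does not verify that the installed version falls in the tested range.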

Get Started

1. Run the demo

Please follow the examples in the demo directory to get started. The code workflow is structured as follows:

# Set the data config and model config
feature_cols = [{...}] # define feature columns
label_col = {...} # define label column
params = {...} # set data params and model params

# Set the feature encoding specs
feature_encoder = FeatureEncoder(feature_cols, label_col, ...) # define the feature encoder
feature_encoder.fit(...) # fit and transform the data

# Load data generators
train_gen, valid_gen, test_gen = data_generator(feature_encoder, ...)

# Define a model
model = DeepFM(...)

# Train the model
model.fit_generator(train_gen, validation_data=valid_gen, ...)

# Evaluation
model.evaluate_generator(test_gen)
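
To make the "fit and transform" step more concrete, here is a toy sketch of what encoding a categorical feature column typically involves: building a vocabulary from the training data, then mapping raw values to integer indices. It is an illustration only, not FuxiCTR's `FeatureEncoder`. The class `ToyCategoricalEncoder` and its methods are hypothetical, and reserving index 0 for unseen values is an assumption.

```python
class ToyCategoricalEncoder:
    """Maps raw categorical values to integer ids (0 is reserved for unseen values)."""

    def __init__(self):
        self.vocab = {}

    def fit(self, values):
        # Assign ids in order of first appearance, starting at 1.
        for value in values:
            if value not in self.vocab:
                self.vocab[value] = len(self.vocab) + 1
        return self

    def transform(self, values):
        # Values never seen during fit fall back to id 0.
        return [self.vocab.get(value, 0) for value in values]


encoder = ToyCategoricalEncoder().fit(["ios", "android", "ios", "web"])
encoded = encoder.transform(["android", "web", "tv"])
print(encoded)  # [2, 3, 0]
```

The integer ids produced this way are what an embedding layer in a CTR model would consume.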

2. Run the benchmark with given experiment_id

To reproduce an experiment result, run the benchmarking script with the corresponding config file. The script accepts the following arguments:

  • --config: The directory containing the data and model config files.
  • --expid: The experiment_id that records the detailed data and model settings.
  • --gpu: The GPU index to use for the experiment; use -1 for CPU.

In the following example, DeepFM_test corresponds to an expid with specific model and dataset configurations located in config/model_config/tests.yaml.

cd benchmarks
python run.py --config ../config --expid DeepFM_test --gpu 0
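
The three flags described above could be parsed along the following lines. This is a hedged sketch of such a command-line interface using `argparse`, not the actual contents of `run.py`. The function names `parse_run_args` and `device_name`, the defaults, and the mapping to a device string are all assumptions.

```python
import argparse


def parse_run_args(argv=None):
    """Parse --config, --expid, and --gpu as described in the text."""
    parser = argparse.ArgumentParser(description="Run one benchmark experiment.")
    parser.add_argument("--config", type=str, default="../config",
                        help="Directory holding the data and model config files.")
    parser.add_argument("--expid", type=str, required=True,
                        help="Experiment id that selects the data/model settings.")
    parser.add_argument("--gpu", type=int, default=-1,
                        help="GPU index to use; -1 means CPU.")
    return parser.parse_args(argv)


def device_name(gpu_index):
    # -1 selects the CPU; any non-negative index selects that GPU.
    return "cpu" if gpu_index < 0 else f"cuda:{gpu_index}"


args = parse_run_args(["--config", "../config", "--expid", "DeepFM_test", "--gpu", "0"])
print(args.expid, device_name(args.gpu))  # DeepFM_test cuda:0
```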

3. Tune the model hyper-parameters

To tune model hyper-parameters, run a grid search over the specified tuning space with the following script. It accepts these arguments:

  • --config: The config file that defines the tuning space.
  • --tag: (optional) A tag that determines which expid to run (e.g., 001 for the first expid). This is useful for rerunning one specific experiment_id that contains the tag.
  • --gpu: The GPUs available for parameter tuning (e.g., --gpu 0 1 for two GPUs).

In the following example, FM_criteo_x4_tuner_config_01.yaml is a demo configuration file that defines the tuning space for parameter tuning.

cd benchmarks
python run_param_tuner.py --config ./FM_criteo_x4_001/FM_criteo_x4_tuner_config_01.yaml --gpu 0 1
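
Grid search enumerates every combination of the candidate values in the tuning space. The sketch below shows the idea with `itertools.product`. The parameter names, the candidate values, and the zero-padded tag scheme (suggested by the `001` example above) are assumptions for illustration, not the format of the tuner config file or the behavior of `run_param_tuner.py`.

```python
import itertools


def expand_grid(tuning_space):
    """Yield one dict of hyper-parameters per point of the grid."""
    names = sorted(tuning_space)  # fixed order keeps the enumeration reproducible
    for values in itertools.product(*(tuning_space[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical tuning space: 2 x 3 = 6 combinations.
tuning_space = {
    "learning_rate": [1e-3, 1e-4],
    "embedding_dim": [8, 16, 32],
}

# Give each combination a zero-padded tag: 001, 002, ...
experiments = {
    f"{index:03d}": params
    for index, params in enumerate(expand_grid(tuning_space), start=1)
}

for tag, params in experiments.items():
    print(tag, params)
```

Under this scheme, rerunning a single configuration amounts to looking up one tag, for example `experiments["001"]`.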

For more running examples, please refer to the "Reproduce-Steps" of benchmarking results in Open-CTR-Benchmark.

Code Structure

See the overview of the code structure for more details on the API design.

License

The MIT License
