sunnymarkliu / rnn-active-user-forecast

This project forked from drop-out/rnn-active-user-forecast

1st place solution for the Kuaishou Active-user Forecast competition

License: MIT License

Jupyter Notebook 63.97% Python 36.03%

rnn-active-user-forecast's Introduction

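Competition Overview

This is an active-user prediction problem. Given Kuaishou users' records of registration, login, video viewing and publishing, and interactions, the task is to predict which users will be active in the next 7 days.

See the competition page for details.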

RNN: Many2One vs Many2Many

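With an RNN, the obvious approach is to take a user's behavior sequence over several days as input and label that sequence with whether the user is active in the following seven days. This is a Many2One solution.

To make full use of the data, this approach needs many sliding windows over the training data for augmentation, which is computationally expensive. In addition, each sequence carries only one label, so gradients propagate poorly and training is difficult. Instead, we can use a Many2Many structure, in which every input step has its own output: whether the user is active in the 7 days after that step. This makes full use of the supervision signal, eases gradient propagation, and makes training easier.

A brief comparison of Many2One and Many2Many follows.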

Many2One Many2Many
No sliding windows needed
Full use of the supervision signal
Variable-length sequences
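
To make the difference in labelling concrete, here is a minimal sketch in plain Python, not the repository's actual code. The function names are hypothetical, and marking days whose 7-day look-ahead is not fully observed as `None` (so they can be masked out of the loss) is an assumption.

```python
def many2many_labels(active, horizon=7):
    """One label per day t: 1 if the user is active on any of days t+1..t+horizon.

    `active` is a list of 0/1 flags, one per day. Days whose look-ahead
    window is not fully observed get None (assumption: masked in the loss).
    """
    labels = []
    for t in range(len(active)):
        window = active[t + 1 : t + 1 + horizon]
        if len(window) < horizon:
            labels.append(None)  # future not fully observed
        else:
            labels.append(1 if any(window) else 0)
    return labels


def many2one_label(active, input_days, horizon=7):
    """Single label for the whole input window (the Many2One setup)."""
    window = active[input_days : input_days + horizon]
    return 1 if any(window) else 0


# Hypothetical 5-day activity record with a 2-day horizon for brevity.
labels = many2many_labels([1, 0, 0, 1, 0], horizon=2)
```

With Many2Many, one pass over a user's sequence yields a label for every step, whereas Many2One yields one label per sliding window.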

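Input Sequence

Compared with xgboost solutions that use historical statistics as features, the RNN needs little preprocessing of its input: the behavior sequences are fed in directly. The per-day inputs are:

  • Whether the user logged in that day (0/1)
  • Number of video views that day (log of the count plus 1)
  • Number of activity records per action_type (log of the count plus 1)
  • Number of activity records per page (log of the count plus 1)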
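
The per-day feature vector listed above could be assembled roughly as follows. This is a sketch under assumptions: the function and argument names are hypothetical, and the natural logarithm is assumed because the source does not state the base.

```python
import math


def daily_features(logged_in, n_views, action_counts, page_counts):
    """Build one day's input vector for a single user.

    logged_in      -- bool, whether the user logged in that day
    n_views        -- number of video views that day
    action_counts  -- activity record counts, one per action_type
    page_counts    -- activity record counts, one per page
    """
    features = [1.0 if logged_in else 0.0, math.log1p(n_views)]
    features += [math.log1p(c) for c in action_counts]  # log(count + 1)
    features += [math.log1p(c) for c in page_counts]    # log(count + 1)
    return features


# Hypothetical day: logged in, no views, two action types, one page.
example = daily_features(True, 0, [0, 1], [3])
```

The log transform compresses heavy-tailed counts, and adding 1 keeps days with zero records at exactly 0.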

Intercept

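In addition, an intercept is concatenated directly at the output layer: the date, device_type, and register_type are one-hot encoded and fed in there. Low-frequency categories can be merged into a single class.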
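
One way to build such an intercept is sketched below. The helper names and the `min_count` threshold are hypothetical; the source only says that low-frequency categories can be merged, not how the cutoff is chosen.

```python
from collections import Counter


def build_vocab(values, min_count=2):
    """Map each frequent category to an index; rare ones share one extra slot."""
    counts = Counter(values)
    frequent = sorted(v for v, c in counts.items() if c >= min_count)
    return {v: i for i, v in enumerate(frequent)}


def one_hot(value, vocab):
    """One-hot vector of size len(vocab) + 1; the last slot is the merged 'other' class."""
    vec = [0.0] * (len(vocab) + 1)
    vec[vocab.get(value, len(vocab))] = 1.0
    return vec


def output_layer_input(rnn_output, *intercept_vectors):
    """Concatenate the RNN output for one step with the one-hot intercept vectors."""
    joined = list(rnn_output)
    for vec in intercept_vectors:
        joined += vec
    return joined


# Hypothetical device_type values; 7 occurs once and falls into 'other'.
vocab = build_vocab([1, 1, 2, 2, 2, 7])
step_input = output_layer_input([0.5, -0.5], one_hot(2, vocab), one_hot(7, vocab))
```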

Variable Length

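Because the sequences have variable length, a dynamic RNN is used. All sequences within a batch have the same length, different batches have different lengths, and each training step randomly draws a batch of some length.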
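
The batching scheme can be sketched as bucketing sequences by length and then sampling a bucket at each step. The function names are hypothetical, and drawing the length uniformly at random is an assumption; the source does not say how the length is chosen.

```python
import random
from collections import defaultdict


def bucket_by_length(sequences):
    """Group sequences so that each bucket holds sequences of one length."""
    buckets = defaultdict(list)
    for seq in sequences:
        buckets[len(seq)].append(seq)
    return dict(buckets)


def sample_batch(buckets, batch_size, rng=random):
    """Pick a length at random, then draw up to batch_size sequences of that length."""
    length = rng.choice(sorted(buckets))
    pool = buckets[length]
    return rng.sample(pool, min(batch_size, len(pool)))


# Hypothetical sequences of lengths 1, 2, and 3.
buckets = bucket_by_length([[1], [2], [1, 2], [3, 4], [5, 6], [1, 2, 3]])
batch = sample_batch(buckets, batch_size=2, rng=random.Random(0))
```

Because every batch is rectangular, no padding or masking is needed inside a batch.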

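Cosine Annealing Snapshot Ensembling

Cosine-annealing snapshot ensembling yields a large number of diverse local optima at very low cost. Blending them at the end gives a significant improvement.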
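
A minimal sketch of the idea follows, assuming a learning rate that restarts at the start of every cycle, one snapshot saved at the end of each cycle, and simple averaging as the blend. The source names neither the cycle length nor the blending method, so these choices and all function names are hypothetical.

```python
import math


def cosine_annealing_lr(step, cycle_len, lr_max, lr_min=0.0):
    """Learning rate that decays from lr_max toward lr_min within each cycle, then restarts."""
    t = step % cycle_len
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / cycle_len))


def is_snapshot_step(step, cycle_len):
    """True on the last step of a cycle, where the learning rate is lowest."""
    return (step + 1) % cycle_len == 0


def average_predictions(snapshot_preds):
    """Blend snapshots by averaging their per-user predictions (assumed blend)."""
    n = len(snapshot_preds)
    return [sum(col) / n for col in zip(*snapshot_preds)]


# Hypothetical predictions from two snapshots for two users.
blended = average_predictions([[0.2, 0.8], [0.4, 0.6]])
```

Each restart lets the model leave its current optimum, so a single training run produces several models to blend.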

rnn-active-user-forecast's People

Contributors

drop-out