shusentang / bdc2019

Third-place solution for the 2019 Collegiate Computing Contest, Big Data Challenge

Home Page: https://www.kesci.com/home/competition/5cc51043f71088002c5b8840/content

License: MIT License

Jupyter Notebook 100.00%
machine-learning deep-learning data-mining competition feature-engineering

bdc2019's Introduction

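2019 Collegiate Computing Contest, Big Data Challenge

Solution from team 鸡你太美 (third place in both the preliminary and final rounds), including all code, documentation, and the defense slides.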

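Task description:

An important task in search is predicting the click-through rate of a doc under a query, given the query and the title. In this competition, teams predict the click-through rate of specified docs from anonymized data. Results are evaluated and ranked on online evaluation data using the specified metric, and the best score wins.

Task categories:

  • Short-text matching
  • Click-through rate (CTR) prediction

Data description:

train_data.sample is the official sample of the training data. It is a headerless CSV file whose columns are separated by ",". The format is as follows: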

Column          Type                                             Example
query_id        int                                              3
query           hash string, terms separated by spaces           1 9 117
query_title_id  unique identifier of the title under the query   2
title           hash string, terms separated by spaces           3 9 120
label           int, value in {0, 1}                             0
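As an illustration of the format above, here is a minimal sketch of reading such headerless rows with Python's standard csv module. The `Row` type and the `parse_rows` helper are hypothetical names introduced for this example and are not taken from the repository; the sample line is assembled from the Example column of the table. Terms are kept as strings, on the assumption that they are opaque hashed tokens.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class Row(NamedTuple):
    """One record, in the column order given in the table (hypothetical type)."""
    query_id: int
    query: List[str]        # hashed terms, kept as opaque strings
    query_title_id: int
    title: List[str]        # hashed terms, kept as opaque strings
    label: int              # 0 or 1


def parse_rows(lines: Iterable[str]) -> List[Row]:
    """Parse headerless, comma-separated lines into Row records."""
    rows = []
    for record in csv.reader(lines):
        if not record:      # skip blank lines
            continue
        query_id, query, query_title_id, title, label = record
        rows.append(Row(
            query_id=int(query_id),
            query=query.split(),    # terms are space-separated
            query_title_id=int(query_title_id),
            title=title.split(),
            label=int(label),
        ))
    return rows


# One line built from the Example column of the table above.
sample = io.StringIO("3,1 9 117,2,3 9 120,0\n")
rows = parse_rows(sample)
```

To read the real file, the same helper could be passed an open file handle, for example `open("train_data.sample", newline="")`.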

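Note: the provided train_data.sample is only meant to help you understand the task and get the code running. Because the sample contains only 20,000 rows, features built from it are of little value (they suffer from severe data leakage).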

Other solutions


If you find this interesting, please give it a star :-D

Finally, thanks to my two teammates @Han and @hcccccccc

bdc2019's Issues

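Click-through rate features

Hello, I built click-through rate features using the 20,000-row offline sample. They rank very low in feature importance and also cause severe data leakage. Is this because I am working offline? Does the online data differ a lot from the offline data?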
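One common way to limit the kind of leakage described in this question is to compute each row's click-rate feature only from rows in other folds, so that a row's own label never contributes to its feature. The sketch below shows that general technique under stated assumptions; it is not the repository's method. The function name `out_of_fold_ctr`, the smoothing parameters, and the index-modulo fold assignment are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence


def out_of_fold_ctr(
    keys: Sequence[Hashable],
    labels: Sequence[int],
    n_folds: int = 5,
    prior: float = 0.5,
    prior_weight: float = 1.0,
) -> List[float]:
    """Smoothed click rate per key, computed only from rows in other folds.

    Folds are assigned by row index modulo n_folds (an assumption made for
    simplicity; shuffled or grouped folds are often preferable in practice).
    """
    fold_of = [i % n_folds for i in range(len(keys))]

    total_clicks = defaultdict(float)
    total_count = defaultdict(int)
    fold_clicks = defaultdict(float)
    fold_count = defaultdict(int)
    for fold, key, label in zip(fold_of, keys, labels):
        total_clicks[key] += label
        total_count[key] += 1
        fold_clicks[(fold, key)] += label
        fold_count[(fold, key)] += 1

    features = []
    for fold, key in zip(fold_of, keys):
        # Remove the statistics of the row's own fold before averaging.
        clicks = total_clicks[key] - fold_clicks[(fold, key)]
        count = total_count[key] - fold_count[(fold, key)]
        # Additive smoothing pulls rarely seen keys toward the prior.
        features.append((clicks + prior * prior_weight) / (count + prior_weight))
    return features


# Toy data: key "a" appears three times, key "b" once.
feats = out_of_fold_ctr(["a", "a", "a", "b"], [1, 0, 1, 0], n_folds=2)
```

A key that appears only within a single fold falls back to the prior, which is one reason such features carry little signal on a sample as small as 20,000 rows.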
