
colinsongf / 2016ccf_stategrid_userprofile


This project forked from feidapeng/2016ccf_stategrid_userprofile


1st Place Solution for the 2016 CCF Big Data Competition, customer profiling task (user profile)

Python 100.00%

2016ccf_stategrid_userprofile's Introduction

1st Place Solution for 2016CCF StateGrid UserProfile

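Competition link: http://www.wid.org.cn/data/science/player/competition/detail/description/242

Task description

In the second round, participants use the 95598 work-order data of electricity customers, together with electricity consumption and billing data from the marketing system, to analyze the characteristics of customers who are sensitive to electricity charges, build a charge-sensitivity model, and quantify how sensitive each such customer is. The goal is to help power supply companies identify charge-sensitive customers quickly and accurately, so that they can offer targeted services such as billing and consumption reminders.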

Data download

Link: https://pan.baidu.com/s/1miTsWI0 Password: 9ziw

Solution

The detailed solution is described in a PDF (click here).

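How to run the code

Users are split into two groups by their number of 95598 work-order records. Features are built and models are trained separately for each group.

  • Users with only one 95598 record are defined as low-sensitivity users, referred to as A or single
  • Users with more than one 95598 record are defined as high-sensitivity users, referred to as B or multi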
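
The split described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes the work-order table has already been reduced to a list of user IDs with one entry per 95598 record, and the function name `split_users_by_record_count` and the sample IDs are hypothetical.

```python
from collections import Counter


def split_users_by_record_count(record_user_ids):
    """Split users into the 'single' (A) and 'multi' (B) groups.

    record_user_ids: iterable with one user ID per 95598 record
    (an assumed input format, chosen only for illustration).
    """
    counts = Counter(record_user_ids)
    single = sorted(u for u, n in counts.items() if n == 1)  # group A
    multi = sorted(u for u, n in counts.items() if n > 1)    # group B
    return single, multi


# Hypothetical sample: user "u2" appears in three records.
single_users, multi_users = split_users_by_record_count(
    ["u1", "u2", "u2", "u3", "u2"]
)
```

Every user lands in exactly one group, so the two models never score the same user.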

1. Configuration

The program requires Python 3 and the following packages:

  • anaconda3
  • xgboost
  • jieba

The program needs the following file:

/stopwords.txt  stop-word list
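
As a rough sketch of how a stop-word list is typically applied to tokenized text: the code below assumes one stop word per line in the file, and the helper names `load_stopwords` and `remove_stopwords` are hypothetical, not taken from the repository. The tokenizer defaults to whitespace splitting so the sketch runs without extra packages; for the Chinese ACCEPT_CONTENT text one would presumably pass `jieba.lcut` instead.

```python
def load_stopwords(lines):
    """Build a stop-word set from an iterable of lines (one word per line, assumed)."""
    return {line.strip() for line in lines if line.strip()}


def remove_stopwords(text, stopwords, tokenize=str.split):
    """Tokenize text and drop every token found in the stop-word set.

    tokenize is pluggable; with jieba installed, tokenize=jieba.lcut
    would segment Chinese text.
    """
    return [tok for tok in tokenize(text) if tok not in stopwords]


# In practice the lines would come from open("stopwords.txt", encoding="utf-8").
stopwords = load_stopwords(["the\n", "of\n", "\n"])
tokens = remove_stopwords("the price of electricity", stopwords)
```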

Place the raw data in the directory below, and make sure every file is UTF-8 encoded.

/rawdata/
    01_arc_s_95598_wkst_train.tsv
    01_arc_s_95598_wkst_test.tsv
    02_s_comm_rec.tsv
    09_arc_a_rcvbl_flow.tsv
    09_arc_a_rcvbl_flow_test.tsv
    train_label.csv
    test_to_predict.csv
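
Before running anything, it can help to confirm that every raw file is present and decodes as UTF-8. The following is a minimal sketch under that assumption; the function name `check_raw_data` is hypothetical, and the file list is copied from the listing above.

```python
import os

RAW_FILES = [
    "01_arc_s_95598_wkst_train.tsv",
    "01_arc_s_95598_wkst_test.tsv",
    "02_s_comm_rec.tsv",
    "09_arc_a_rcvbl_flow.tsv",
    "09_arc_a_rcvbl_flow_test.tsv",
    "train_label.csv",
    "test_to_predict.csv",
]


def check_raw_data(raw_dir, filenames=RAW_FILES):
    """Return a dict mapping each problem file to 'missing' or 'not utf-8'."""
    problems = {}
    for name in filenames:
        path = os.path.join(raw_dir, name)
        if not os.path.isfile(path):
            problems[name] = "missing"
            continue
        try:
            with open(path, encoding="utf-8") as fh:
                for _ in fh:  # decoding happens while the file is read
                    pass
        except UnicodeDecodeError:
            problems[name] = "not utf-8"
    return problems
```

Calling `check_raw_data("rawdata")` from the project root returns an empty dict when all seven files are in place and readable as UTF-8.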

Other directories

/code/  holds the program code
/myfeatures/  holds the feature files generated while the program runs
/result/  holds the final output
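
Whether the scripts create the output directories themselves is not stated here, so creating them up front is a harmless precaution. A minimal sketch, with the hypothetical helper name `ensure_output_dirs`:

```python
import os


def ensure_output_dirs(project_root, names=("myfeatures", "result")):
    """Create the output directories if they do not exist yet; return their paths."""
    paths = []
    for name in names:
        path = os.path.join(project_root, name)
        os.makedirs(path, exist_ok=True)  # no error if the directory already exists
        paths.append(path)
    return paths
```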

2. Running

After confirming that the files above exist, run the following in order:

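python code/create_features_A.py    # generate the feature files for low-sensitivity users
python code/select_features_A.py    # use xgboost to select text features for low-sensitivity users
python code/model_A.py              # train the prediction models for low-sensitivity users and ensemble them
python code/create_features_B.py    # generate the feature files for high-sensitivity users
python code/select_features_B.py    # use xgboost to select text features for high-sensitivity users
python code/model_B.py              # train the prediction models for high-sensitivity users and ensemble them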
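
The six commands above can also be driven from one small script that stops at the first failure. This is only a convenience sketch, not part of the repository: the names `PIPELINE` and `run_pipeline` are hypothetical, and it assumes it is started from the project root.

```python
import subprocess
import sys

# Script order copied from the commands above.
PIPELINE = [
    "code/create_features_A.py",
    "code/select_features_A.py",
    "code/model_A.py",
    "code/create_features_B.py",
    "code/select_features_B.py",
    "code/model_B.py",
]


def run_pipeline(scripts=PIPELINE, runner=subprocess.run):
    """Run each script in order with the current Python interpreter.

    check=True makes subprocess.run raise CalledProcessError on a
    non-zero exit code, so later steps never run on stale inputs.
    """
    for script in scripts:
        runner([sys.executable, script], check=True)


# Usage (from the project root): run_pipeline()
```

The `runner` parameter exists only so the sequence can be exercised without launching real processes.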

3. Output files

The program produces two kinds of output, feature files and the final predictions:

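myfeatures/
    statistical_features_1.pkl  statistical features of low-sensitivity users
    text_features_1.pkl         ACCEPT_CONTENT text from table 1 for low-sensitivity users
    single_select_words.pkl     text features selected with xgboost for low-sensitivity users
    statistical_features_2.pkl  statistical features of high-sensitivity users
    text_features_2.pkl         ACCEPT_CONTENT text from table 1 for high-sensitivity users
    multi_select_words.pkl      text features selected with xgboost for high-sensitivity users

result/
    A.csv               charge-sensitive users among the low-sensitivity users
    B.csv               charge-sensitive users among the high-sensitivity users
    result.csv          merged result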
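
To illustrate what the merged result amounts to, the sketch below concatenates the two per-group files into one. It is an assumption-laden illustration rather than the repository's merge step: it assumes both files are header-less with one record per line, and the function name `merge_results` is hypothetical.

```python
import os


def merge_results(result_dir, parts=("A.csv", "B.csv"), output="result.csv"):
    """Concatenate the part files into one output file, skipping blank lines.

    Assumes header-less files with one record per line (not confirmed
    by the source).
    """
    out_path = os.path.join(result_dir, output)
    with open(out_path, "w", encoding="utf-8") as out:
        for part in parts:
            with open(os.path.join(result_dir, part), encoding="utf-8") as fh:
                for line in fh:
                    if line.strip():
                        out.write(line.rstrip("\n") + "\n")
    return out_path
```

Because groups A and B are disjoint by construction, plain concatenation cannot introduce duplicate users under these assumptions.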

Other

If you find this project useful, please star the repository. Thanks!

Other competitions we took part in:

2nd Place Solution for SMP CUP 2016

Final winner solution for 2016 CCF Sogou User Profile Mining for Big Data Precision Marketing

2016ccf_stategrid_userprofile's People

Contributors

feidapeng
