yilifzf / bdci_car_2018
BDCI 2018 Automotive Industry User Opinion Topic and Sentiment Recognition: First Prize Solution (Final Round)
Could you explain how to install scikit-multilearn? The line
from skmultilearn.problem_transform import BinaryRelevance, ClassifierChain, LabelPowerset
keeps failing with:
ModuleNotFoundError: No module named 'skmultilearn'
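A minimal sketch of one way to resolve this, assuming the PyPI package scikit-multilearn (imported as skmultilearn) is the library the repo expects; the class and install names below come from that library, not from this repo's code:

    # Install the multi-label wrapper library first (assumption: the PyPI
    # package name is scikit-multilearn, imported as skmultilearn):
    #   pip install scikit-multilearn
    from sklearn.naive_bayes import GaussianNB
    from skmultilearn.problem_transform import BinaryRelevance

    # Binary relevance trains one binary classifier per label.
    clf = BinaryRelevance(classifier=GaussianNB())
    # X: (n_samples, n_features) features, Y: (n_samples, n_labels) 0/1 matrix.
    # clf.fit(X, Y)
    # predictions = clf.predict(X_new)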
Does this mean training for 5 epochs, or splitting the data into 5 folds?
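In case it helps to see the difference, here is a minimal sketch contrasting the two readings, using sklearn's KFold for the 5-fold split; none of the variable names below come from the repo:

    from sklearn.model_selection import KFold

    # Reading 1: "--num_train_epochs 5" means 5 passes over the same training set.
    # Reading 2: 5-fold cross-validation splits the data into 5 parts and trains
    # 5 models, each holding out a different fold for validation.
    kf = KFold(n_splits=5, shuffle=True, random_state=42)
    # for fold, (train_idx, valid_idx) in enumerate(kf.split(samples)):
    #     train_one_model(samples[train_idx], samples[valid_idx])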
After running the UNK version of prepare_w2v, training reports an error related to UNK; after re-running the word segmentation to regenerate the vocabulary and word vectors and then training again, I get an AssertionError instead.
Hello, I have downloaded the simplified-Chinese ELMo model and gone through the basic steps on its GitHub page.
However, I am not sure how to use the pretrained ELMo model in PyTorch for the downstream classification task. Some issues suggest following the word2vec usage, but word2vec gives static word vectors, and other issues say the HIT ELMo currently cannot be used together with allennlp.
Could you provide some PyTorch usage examples? Many thanks.
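Not the author's code, just a minimal sketch of one common way to get contextual vectors from the HIT Chinese ELMo, assuming the ELMoForManyLangs package is installed and its pretrained model has been unpacked locally (the path below is a placeholder):

    from elmoformanylangs import Embedder

    # Placeholder path: point this at the unpacked simplified-Chinese ELMo model.
    elmo = Embedder('/path/to/zhs.model')

    # Input is a list of already-tokenised sentences (lists of words).
    sents = [['这', '款', '车', '油耗', '很', '低'],
             ['内饰', '做工', '一般']]
    # Returns one (sentence_length, 1024) numpy array per sentence; the vectors
    # are contextual, so they can replace static word2vec embeddings as the
    # input to a downstream PyTorch classifier.
    vectors = elmo.sents2elmo(sents)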
I suspect the batch_size is too large, but reducing it does not help. Could it be an uneven load across GPUs? Do you have any suggestions or workarounds?
When running prediction on my own data I ran into UNK tokens, so I tried running prepare_w2v_with_UNK.py, but it fails because the file cc.zh.300.bin cannot be found under the embedding directory. How should I resolve this step?
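For context, cc.zh.300.bin matches the naming of the pretrained Chinese fastText vectors distributed on fasttext.cc; below is a minimal sketch of loading them with gensim, which is an assumption about how the file could be consumed, not a description of what prepare_w2v_with_UNK.py actually does:

    from gensim.models.fasttext import load_facebook_vectors

    # Assumption: cc.zh.300.bin is the official fastText Chinese model downloaded
    # from https://fasttext.cc/docs/en/crawl-vectors.html and placed under embedding/.
    wv = load_facebook_vectors('embedding/cc.zh.300.bin')

    # fastText can compose vectors for out-of-vocabulary words from character
    # n-grams, which is one way to handle UNK tokens.
    print(wv['油耗'].shape)  # (300,)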
Dear author, hello!
I have been reading your code recently and tried to run it on my own corpus, but at the w2v step this file could not be found. May I ask what you used to build the word vectors? Thanks a lot!
Beginner question: how do you build ELMo sentence vectors? Is the corresponding preprocessing included in the source code?
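One common approach (an assumption, not necessarily what this repo does) is to mean-pool the per-token ELMo vectors into a single fixed-size sentence vector:

    import numpy as np

    def sentence_vector(token_vectors):
        # token_vectors: (sentence_length, 1024) contextual ELMo vectors for one
        # sentence; averaging over tokens gives a fixed-size sentence vector.
        return np.mean(token_vectors, axis=0)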
Hi, I wanted to run the code directly to see the results, but I could not find the bert_config.json file referenced in the command line.
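For reference, the run command later on this page points $BERT_BASE_DIR at three files; a small sanity check (the directory path below is a placeholder) could look like:

    import os

    # Placeholder: the directory holding the pretrained BERT files referenced by
    # the run command (vocab.txt, bert_config.json, pytorch_model.bin).
    BERT_BASE_DIR = '/path/to/chinese_bert_base'
    for name in ('vocab.txt', 'bert_config.json', 'pytorch_model.bin'):
        print(name, os.path.exists(os.path.join(BERT_BASE_DIR, name)))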
Hello, using BERT to do direct three-way sentiment classification on the train data, I get about 75% accuracy on a 500-example valid set; SVM and Naive Bayes baselines reach about 73%. Does this look reasonable?
What final F1 can be reached using only BERT for category polarity?
The slides in the repository include a line chart of model improvements, where the best model reaches 0.7038. However, on https://www.datafountain.cn/competitions/329/ranking your score is 0.90084990. Why is that? Is there some particularly effective improvement beyond what the slides describe? I would appreciate an explanation, thank you.
Looking at the BERT model structure, the pipeline is: predict the topic => predict the topic's polarity (3-way classification); but errors in topic prediction then accumulate in the polarity stage.
If the polarity model's structure were used directly to predict, for every topic, a 4-way label (3 polarities + 1 "topic absent" class), would that work better? (A concrete sketch of this joint formulation follows below.)
Also, since in the author's setup the BERT polarity training set uses the original gold data rather than the topics predicted by the first stage, the reported precision, recall, and F1 for that stage do not seem very meaningful. Could you share the actual k-fold validation results of the single BERT model?
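To make the comparison concrete, here is a minimal sketch (a hypothetical label encoding, not the repo's actual data format) of the joint per-aspect 4-class formulation versus the two-stage pipeline labels:

    ASPECTS = ['动力', '油耗', '内饰']           # hypothetical subset of the aspects
    POLARITY = {'负面': 0, '中性': 1, '正面': 2}  # 3-way polarity used by the pipeline

    def joint_labels(gold):
        # gold: dict mapping mentioned aspects to their polarity for one sentence.
        # Joint formulation: one 4-way label per aspect, class 3 = aspect absent,
        # so a single model predicts topic presence and polarity together and no
        # topic-prediction error can propagate into a second stage.
        return [POLARITY[gold[a]] if a in gold else 3 for a in ASPECTS]

    print(joint_labels({'油耗': '正面'}))  # [3, 2, 3]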
Running: python run_classifier_ensemble.py --task_name Aspect --do_train --do_eval --do_lower_case --data_dir $GLUE_DIR/aspect_ensemble_online --vocab_file $BERT_BASE_DIR/vocab.txt --bert_config_file $BERT_BASE_DIR/bert_config.json --init_checkpoint $BERT_BASE_DIR/pytorch_model.bin --max_seq_length 128 --train_batch_size 24 --learning_rate 2e-5 --num_train_epochs 5 --output_dir $GLUE_DIR/aspect_ensemble_online --seed 42
produces the error below. The configuration and environment are fine. Didn't the author use the official Google pretrained model converted into pytorch_model.bin?
FOLDS: 0
12/05/2018 15:56:46 - INFO - __main__ - device cuda n_gpu 8 distributed training False
12/05/2018 15:56:46 - INFO - __main__ - LOOKING AT glue_data/aspect_ensemble_online/1/train.tsv
Traceback (most recent call last):
File "run_classifier_ensemble.py", line 887, in
main()
File "run_classifier_ensemble.py", line 636, in main
train(args)
File "run_classifier_ensemble.py", line 704, in train
model.bert.load_state_dict(torch.load(args.init_checkpoint, map_location='cpu'))
File "/home/zhuyuepeng01/.conda/envs/py3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BertModel:
Missing key(s) in state_dict: "embeddings.word_embeddings.weight", "embeddings.position_embeddings.weight", "embeddings.token_type_embeddings.weight", "embeddings.LayerNorm.gamma", "embeddings.LayerNorm.beta", "encoder.layer.0.attention.self.query.weight", "encoder.layer.0.attention.self.query.bias", "encoder.layer.0.attention.self.key.weight", "encoder.layer.0.attention.self.key.bias", "encoder.layer.0.attention.self.value.weight", "encoder.layer.0.attention.self.value.bias", "encoder.layer.0.attention.output.dense.weight", "encoder.layer.0.attention.output.dense.bias", "encoder.layer.0.attention.output.LayerNorm.gamma", "encoder.layer.0.attention.output.LayerNorm.beta", "encoder.layer.0.intermediate.dense.weight", "encoder.layer.0.intermediate.dense.bias", "encoder.layer.0.output.dense.weight", "encoder.layer.0.output.dense.bias", "encoder.layer.0.output.LayerNorm.gamma", "encoder.layer.0.output.LayerNorm.beta", "encoder.layer.1.attention.self.query.weight", "encoder.layer.1.attention.self.query.bias", "encoder.layer.1.attention.self.key.weight", "encoder.layer.1.attention.self.key.bias", "encoder.layer.1.attention.self.value.weight", "encoder.layer.1.attention.self.value.bias", "encoder.layer.1.attention.output.dense.weight", "encoder.layer.1.attention.output.dense.bias", "encoder.layer.1.attention.output.LayerNorm.gamma", "encoder.layer.1.attention.output.LayerNorm.beta", "encoder.layer.1.intermediate.dense.weight", "encoder.layer.1.intermediate.dense.bias", "encoder.layer.1.output.dense.weight", "encoder.layer.1.output.dense.bias", "encoder.layer.1.output.LayerNorm.gamma", "encoder.layer.1.output.LayerNorm.beta", "encoder.layer.2.attention.self.query.weight", "encoder.layer.2.attention.self.query.bias", "encoder.layer.2.attention.self.key.weight", "encoder.layer.2.attention.self.key.bias", "encoder.layer.2.attention.self.value.weight", "encoder.layer.2.attention.self.value.bias", "encoder.layer.2.attention.output.dense.weight", "encoder.layer.2.attention.output.dense.bias", "encoder.layer.2.attention.output.LayerNorm.gamma", "encoder.layer.2.attention.output.LayerNorm.beta", "encoder.layer.2.intermediate.dense.weight", "encoder.layer.2.intermediate.dense.bias", "encoder.layer.2.output.dense.weight", "encoder.layer.2.output.dense.bias", "encoder.layer.2.output.LayerNorm.gamma", "encoder.layer.2.output.LayerNorm.beta", "encoder.layer.3.attention.self.query.weight", "encoder.layer.3.attention.self.query.bias", "encoder.layer.3.attention.self.key.weight", "encoder.layer.3.attention.self.key.bias", "encoder.layer.3.attention.self.value.weight", "encoder.layer.3.attention.self.value.bias", "encoder.layer.3.attention.output.dense.weight", "encoder.layer.3.attention.output.dense.bias", "encoder.layer.3.attention.output.LayerNorm.gamma", "encoder.layer.3.attention.output.LayerNorm.beta", "encoder.layer.3.intermediate.dense.weight", "encoder.layer.3.intermediate.dense.bias", "encoder.layer.3.output.dense.weight", "encoder.layer.3.output.dense.bias", "encoder.layer.3.output.LayerNorm.gamma", "encoder.layer.3.output.LayerNorm.beta", "encoder.layer.4.attention.self.query.weight", "encoder.layer.4.attention.self.query.bias", "encoder.layer.4.attention.self.key.weight", "encoder.layer.4.attention.self.key.bias", "encoder.layer.4.attention.self.value.weight", "encoder.layer.4.attention.self.value.bias", "encoder.layer.4.attention.output.dense.weight", "encoder.layer.4.attention.output.dense.bias", "encoder.layer.4.attention.output.LayerNorm.gamma", "encoder.layer.4.attention.output.LayerNorm.beta", 
"encoder.layer.4.intermediate.dense.weight", "encoder.layer.4.intermediate.dense.bias", "encoder.layer.4.output.dense.weight", "encoder.layer.4.output.dense.bias", "encoder.layer.4.output.LayerNorm.gamma", "encoder.layer.4.output.LayerNorm.beta", "encoder.layer.5.attention.self.query.weight", "encoder.layer.5.attention.self.query.bias", "encoder.layer.5.attention.self.key.weight", "encoder.layer.5.attention.self.key.bias", "encoder.layer.5.attention.self.value.weight", "encoder.layer.5.attention.self.value.bias", "encoder.layer.5.attention.output.dense.weight", "encoder.layer.5.attention.output.dense.bias", "encoder.layer.5.attention.output.LayerNorm.gamma", "encoder.layer.5.attention.output.LayerNorm.beta", "encoder.layer.5.intermediate.dense.weight", "encoder.layer.5.intermediate.dense.bias", "encoder.layer.5.output.dense.weight", "encoder.layer.5.output.dense.bias", "encoder.layer.5.output.LayerNorm.gamma", "encoder.layer.5.output.LayerNorm.beta", "encoder.layer.6.attention.self.query.weight", "encoder.layer.6.attention.self.query.bias", "encoder.layer.6.attention.self.key.weight", "encoder.layer.6.attention.self.key.bias", "encoder.layer.6.attention.self.value.weight", "encoder.layer.6.attention.self.value.bias", "encoder.layer.6.attention.output.dense.weight", "encoder.layer.6.attention.output.dense.bias", "encoder.layer.6.attention.output.LayerNorm.gamma", "encoder.layer.6.attention.output.LayerNorm.beta", "encoder.layer.6.intermediate.dense.weight", "encoder.layer.6.intermediate.dense.bias", "encoder.layer.6.output.dense.weight", "encoder.layer.6.output.dense.bias", "encoder.layer.6.output.LayerNorm.gamma", "encoder.layer.6.output.LayerNorm.beta", "encoder.layer.7.attention.self.query.weight", "encoder.layer.7.attention.self.query.bias", "encoder.layer.7.attention.self.key.weight", "encoder.layer.7.attention.self.key.bias", "encoder.layer.7.attention.self.value.weight", "encoder.layer.7.attention.self.value.bias", "encoder.layer.7.attention.output.dense.weight", "encoder.layer.7.attention.output.dense.bias", "encoder.layer.7.attention.output.LayerNorm.gamma",