indiejoseph / cnn-text-classification-tf-chinese

CNN for Chinese Text Classification in Tensorflow

A sentiment classifier forked from dennybritz/cnn-text-classification-tf. The data helper has been modified to support Chinese, and the embedding has been changed from word level to character level, although this increases the vocabulary size. I have also implemented the network structure from Character-Aware Neural Language Models, which combines a CNN with a highway network to improve performance. This version reaches an accuracy of 98% on the Chinese corpus.
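
The character-level input described above can be sketched as follows. This is only an illustration, not the repository's data helper: the function names `build_char_vocab` and `encode`, the `<PAD>` and `<UNK>` tokens, and the sample sentences are all assumptions, and the snippet assumes Python 3 strings (under Python 2.7 the texts would need to be `unicode` objects).

```python
def build_char_vocab(texts, pad_token="<PAD>", unk_token="<UNK>"):
    """Map every distinct character in `texts` to an integer id.

    Ids 0 and 1 are reserved for padding and unknown characters
    (a convention assumed for this sketch).
    """
    vocab = {pad_token: 0, unk_token: 1}
    for text in texts:
        for ch in text:
            if ch not in vocab:
                vocab[ch] = len(vocab)
    return vocab


def encode(text, vocab, unk_token="<UNK>"):
    """Turn a string into a list of character ids."""
    unk_id = vocab[unk_token]
    return [vocab.get(ch, unk_id) for ch in text]


# Hypothetical two-sentence corpus, used only to exercise the functions.
corpus = ["我喜欢这部电影", "这部电影很差"]
vocab = build_char_vocab(corpus)
ids = encode("我喜欢", vocab)
```

Because every character is its own token, no word segmentation step is needed before the embedding lookup.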

This code belongs to the "Implementing a CNN for Text Classification in Tensorflow" blog post.

It is a slightly simplified implementation of Kim's Convolutional Neural Networks for Sentence Classification paper in Tensorflow.

Requirements

Python 2.7
Tensorflow 0.9.0
Numpy

Running

Print parameters:

./train.py --help

optional arguments:
  -h, --help            show this help message and exit
  --embedding_dim EMBEDDING_DIM
                        Dimensionality of character embedding (default: 128)
  --filter_sizes FILTER_SIZES
                        Comma-separated filter sizes (default: '1,2,3,4,5,6,8')
  --num_filters NUM_FILTERS
                        Number of filters per filter size (default: '50,100,150,150,200,200,200')
  --l2_reg_lambda L2_REG_LAMBDA
                        L2 regularization lambda (default: 0.0)
  --dropout_keep_prob DROPOUT_KEEP_PROB
                        Dropout keep probability (default: 0.5)
  --batch_size BATCH_SIZE
                        Batch Size (default: 32)
  --num_epochs NUM_EPOCHS
                        Number of training epochs (default: 100)
  --evaluate_every EVALUATE_EVERY
                        Evaluate model on dev set after this many steps
                        (default: 100)
  --checkpoint_every CHECKPOINT_EVERY
                        Save model after this many steps (default: 100)
  --allow_soft_placement ALLOW_SOFT_PLACEMENT
                        Allow device soft device placement
  --noallow_soft_placement
  --log_device_placement LOG_DEVICE_PLACEMENT
                        Log placement of ops on devices
  --nolog_device_placement
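
The two comma-separated defaults above each list seven values. A minimal sketch of how such flags could be parsed and paired, assuming (this is not confirmed by the source) that the n-th filter count belongs to the n-th filter size; `parse_int_list` and `pair_filters` are hypothetical helper names:

```python
def parse_int_list(flag_value):
    """Split a comma-separated flag string into a list of ints."""
    return [int(item) for item in flag_value.split(",")]


def pair_filters(filter_sizes, num_filters):
    """Pair each filter size with its filter count, position by position."""
    sizes = parse_int_list(filter_sizes)
    counts = parse_int_list(num_filters)
    if len(sizes) != len(counts):
        raise ValueError("filter_sizes and num_filters must have the same length")
    return list(zip(sizes, counts))


# Default values copied from the help output above.
pairs = pair_filters("1,2,3,4,5,6,8", "50,100,150,150,200,200,200")

# Width of the concatenated feature vector if one max-pooled value is kept per filter.
total_features = sum(count for _, count in pairs)
```

Under that pairing assumption the defaults would give 1050 pooled features in total.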

Train:

./train.py

cnn-text-classification-tf-chinese's Issues

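Hello, could you provide a matching eval.py?

Hello, I tried to write an eval.py based on the parameters in train.py and on the train.py and eval.py from cnn-text-classification-tf, but the test accuracy is only 0.503125.
I am new to TensorFlow. When the test set differs, the maximum sequence_length differs too, so the model shapes do not match. I tried switching to a placeholder and failed. I then used the whole training set as the test set, and the resulting accuracy of 50.3% is far from the roughly 98% reached at the end of training.
Could you provide a matching eval.py?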
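
One common way to avoid the shape mismatch described above is to pad or truncate every evaluation sequence to the sequence length used at training time, rather than recomputing the maximum from the test set. The sketch below assumes the inputs are already lists of integer ids and that id 0 is the padding id; `pad_or_truncate` is a hypothetical helper, not part of this repository.

```python
def pad_or_truncate(ids, length, pad_id=0):
    """Return `ids` forced to exactly `length` items.

    Longer sequences are cut off, shorter ones are right-padded with `pad_id`
    (assumed here to be the padding id used during training).
    """
    if len(ids) >= length:
        return ids[:length]
    return ids + [pad_id] * (length - len(ids))


# `train_sequence_length` stands for the value saved from training (assumed name).
train_sequence_length = 4
short = pad_or_truncate([5, 6], train_sequence_length)
long = pad_or_truncate([1, 2, 3, 4, 5], train_sequence_length)
```

The same reasoning applies to the vocabulary: the ids fed to a restored model only make sense if they come from the mapping built during training.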

Out of memory on GPU GTX 970 (4 GB memory)

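What should I do if there are multiple sentences?

Hello, I see that you wrote this for Chinese text classification, so I asked in Chinese.
This demo seems to do sentence-level classification. What should I do if I have multiple sentences?
Should I change the number of channels to the number of sentences? And which batch should I choose in that case?

I hope you can reply when you have time. Many thanks.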

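Could you provide a version compatible with Python 3.5?

I am just getting started. Installing drivers on Linux is too much trouble, so I set things up on Windows, but TensorFlow only supports Python 3.5 on Windows.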

where is the data?

This is totally a mis-report!

Sorry!

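Hello, where did this Chinese sentiment analysis data come from?

Thanks

What is the format of your Chinese input data? Multiple words per line, separated by spaces?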

InternalError: Dst tensor is not initialized.

It reports that there is not enough memory, but there is clearly enough, with 100 GB left.

I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 6652672 totalling 6.34MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] Sum Total of in-use chunks: 42.12MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:698] Stats: 
Limit:                    91684864
InUse:                    44171776
MaxInUse:                 48052736
NumAllocs:                     138
MaxAllocSize:              6652672

W tensorflow/core/common_runtime/bfc_allocator.cc:270] ******************xx**************************____******____________________________________________
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 44.94MiB.  See logs for memory state.
Traceback (most recent call last):

Name: tensorflow
Version: 0.11.0rc1
Location: /usr/local/lib/python2.7/dist-packages
Requires: mock, protobuf, numpy, wheel, six

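How should I modify this for multi-class classification?

Hello,

I have recently been working on multi-class classification of Chinese text. I have just started learning TensorFlow and do not understand it well yet. If I am not mistaken, this code does binary classification. How should I modify it to do multi-class classification? Thanks!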
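
For the multi-class question above, one usual ingredient is a one-hot label vector with one slot per class instead of two. The sketch below only shows that label encoding; it assumes the model's output layer width can be set to the number of classes, which the source does not confirm for this fork. The helper name `one_hot_labels` and the sample class names are hypothetical.

```python
def one_hot_labels(labels):
    """Encode a list of class names as one-hot rows.

    Returns the sorted class list and one row per input label.
    """
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    vectors = []
    for name in labels:
        row = [0] * len(classes)
        row[index[name]] = 1
        vectors.append(row)
    return classes, vectors


# Hypothetical labels for four documents.
classes, y = one_hot_labels(["sports", "finance", "sports", "tech"])
num_classes = len(classes)  # candidate value for the output layer width
```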

What are the precision and recall of this code on the test set?

As titled, thanks!
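
The source does not report precision or recall. As a sketch of how they could be computed from a list of true labels and a list of predictions (the function name and the sample labels are made up for illustration, with label 1 assumed to be the positive class):

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / float(tp + fp) if tp + fp else 0.0
    recall = tp / float(tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
```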

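Hello, have you tried feeding word-segmented data as input and using pretrained word vectors?

As titled. If you have tried it, how well did it work? Does it improve precision and recall? Thanks.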

The model does not seem to converge?

The training accuracy fluctuates between 55% and 75% even after 800 steps.
In your case, how many steps did it take to converge to 98%?

There is no _linear function in TensorFlow 1.0.1

I managed to run the code in TF 1.0.1 using the conversion script. However, I still cannot find the _linear function in TF 1.0.1 that text_cnn.py uses at line 14. Could you give some suggestions? Thanks.
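
If `_linear` is only used as an affine map inside the highway layer (an assumption, since the source does not show text_cnn.py), the computation it supports can be written out by hand and then ported to whichever TensorFlow ops are available. The plain-Python sketch below shows the highway formula y = t * relu(W_h x + b_h) + (1 - t) * x with t = sigmoid(W_t x + b_t); the ReLU activation and all names are assumptions for illustration.

```python
import math


def linear(x, weights, bias):
    """Affine map: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def highway(x, w_h, b_h, w_t, b_t):
    """Highway layer: a gate t mixes a transformed input with the raw input."""
    h = [max(0.0, v) for v in linear(x, w_h, b_h)]  # transform branch (ReLU assumed)
    t = [sigmoid(v) for v in linear(x, w_t, b_t)]   # transform gate in (0, 1)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


x = [1.0, -2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# A strongly negative gate bias closes the gate, so the input is carried through.
carried = highway(x, identity, [0.0, 0.0], zeros, [-100.0, -100.0])
# A strongly positive gate bias opens the gate, so the output is the transformed branch.
transformed = highway(x, identity, [0.0, 0.0], zeros, [100.0, 100.0])
```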

Error when running directly

Hello, I get an error as soon as I run ./train.py.
Could it be a version mismatch? 0.9.0 >> 1.0.1

My tensorflow version is 1.0.1
python is 2.7.13

Traceback (most recent call last):
  File "./train.py", line 76, in <module>
    l2_reg_lambda=FLAGS.l2_reg_lambda)
  File "/Users/edwardchan/projects/classification/cnn-text-classification-tf-chinese/text_cnn.py", line 75, in __init__
    self.h_pool = tf.concat(3, pooled_outputs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1029, in concat
    dtype=dtypes.int32).get_shape(
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 637, in convert_to_tensor
    as_ref=False)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 702, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 110, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 99, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.
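
The failing call in the traceback above is `tf.concat(3, pooled_outputs)`. TensorFlow 1.0 changed the argument order of `tf.concat` to `(values, axis)`, so under 1.0.1 the call would need to read `tf.concat(pooled_outputs, 3)`. A small sketch of a version-aware wrapper follows; `concat_compat` is a hypothetical helper, and the module is passed in as a parameter so the snippet itself does not import TensorFlow.

```python
def concat_compat(tf_module, values, axis):
    """Call `concat` with the argument order expected by the given TensorFlow module.

    Before 1.0 the signature was concat(concat_dim, values);
    from 1.0 onward it is concat(values, axis).
    """
    major = int(tf_module.__version__.split(".")[0])
    if major >= 1:
        return tf_module.concat(values, axis)
    return tf_module.concat(axis, values)


# Intended usage inside text_cnn.py (not executed here):
#   import tensorflow as tf
#   self.h_pool = concat_compat(tf, pooled_outputs, 3)
```

If only TensorFlow 1.x needs to be supported, swapping the two arguments directly at the call site is simpler than keeping a wrapper.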

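Hello, I have a question. Could you explain what your code does and what it is mainly used for? Word segmentation?

Hello, and sorry to bother you. I am new to tensorflow. I copied your code and trained it on python 3.5.2.

The accuracy I saw was only 70%. Is that because of the training time? How long does training normally take? Also, from what I saw, this is English text. Does your cnn do word segmentation, or something else? Please take a moment to help answer.
I also want to detect defects in different parts of a photo. Which tensorflow architecture should I use, a cnn? Could you recommend an architecture?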
