andyweizhao / capsule_text_classification

Python 100.00%

capsule_text_classification's Issues

The format of the input data

Would you please tell us the format of the input data, i.e., how to use your code on our own data? Thank you very much.

Code question

Hello,
What does this part of your code mean, and what do hk_offsets and wk_offsets each represent? Thanks~

main.py: error: unrecognized arguments: -- model_type CNN --learning_rate 0.0005

python ./main.py --model_type CNN --learning_rate 0.0005

python ./main.py --model_type capsule-A --learning_rate 0.001

I later found that there was an extra space in front of model_type (`-- model_type` instead of `--model_type`).
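
The failure mode in this issue can be reproduced with a small argparse sketch. The parser below is a hypothetical stand-in for the one in main.py: only the two flags that appear in the commands above are assumed, and the defaults are made up for illustration. A bare `--` tells argparse that no more options follow, so everything after it is treated as positional input and reported as unrecognized.

```python
import argparse

# Hypothetical stand-in for the parser in main.py; only the two flags
# shown in the commands above are assumed, and the defaults are invented.
parser = argparse.ArgumentParser()
parser.add_argument("--model_type", default="capsule-A")
parser.add_argument("--learning_rate", type=float, default=0.001)

good = "--model_type CNN --learning_rate 0.0005".split()
bad = "-- model_type CNN --learning_rate 0.0005".split()  # stray space after "--"

# parse_known_args returns leftovers instead of exiting with an error.
good_args, good_extra = parser.parse_known_args(good)
bad_args, bad_extra = parser.parse_known_args(bad)

print("good:", good_args.model_type, good_args.learning_rate, good_extra)
print("bad: ", bad_args.model_type, bad_args.learning_rate, bad_extra)
```

With the stray space, neither flag is parsed, so both values silently stay at their defaults and the tokens end up in the leftover list that `parse_args` would reject.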

Please list your requirements in a working environment by running pip freeze. I tried with the following but I'm getting theano.tensor.var.AsTensorError: ('Cannot convert Tensor("capsule_3/primary/Reshape:0", shape=(25, 99, 1, 16, 16), dtype=float32) to TensorType', <class 'tensorflow.python.framework.ops.Tensor'>):

absl-py==0.6.0
astor==0.7.1
backports.weakref==1.0.post1
bleach==1.5.0
enum34==1.1.6
funcsigs==1.0.2
futures==3.2.0
gast==0.2.0
grpcio==1.16.0
h5py==2.8.0
html5lib==0.9999999
Keras==2.2.4
Keras-Applications==1.0.6
Keras-Preprocessing==1.0.5
Markdown==3.0.1
mock==2.0.0
numpy==1.15.3
pbr==5.1.0
protobuf==3.6.1
PyYAML==3.13
scikit-learn==0.17.1
scipy==1.1.0
six==1.11.0
tensorboard==1.11.0
tensorflow==1.4.1
tensorflow-tensorboard==0.4.0
termcolor==1.1.0
Theano==0.8.0
Werkzeug==0.14.1

got an error

Thank you for sharing. When I run your code, I get an error: ValueError: Dimensions must be equal, but are 84840 and 16 for 'capsule_3/conv2/add_2' (op: 'Add') with input shapes: [84840,16,48], [84840,16,16,16,48]. I changed the input data but nothing else. Could you give me some suggestions?

The loss doesn't change

Hello, after I change weight_sharing from true to false, the loss doesn't change for the first 100 iterations. After that the model works properly, but the loss decays quite slowly. Can you give me some suggestions? I believe the key issue lies in the squash function, but I don't know how to amend it.
Thank you!

How do you split the dataset in the paper?

Can you explain in detail how you split each dataset in the paper, or provide the splits? Some datasets don't have a test set or a validation set.

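Some parameters in the code do not match the paper

For the capsule-A model:
1. N-gram convolutional layer: the paper says the convolution stride is 1, but the code uses 2.
2. Primary layer: regarding the value of C, the code uses 32, so why is my output prim poses dimension:(25, 99, 1, 16, 16)?
Is C 16?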

next update

May I ask when the experiments will be updated, please?

"Coefficients Amendment" strategy isn't implemented in the code

Hi @andyweizhao,
I found that the "Coefficients Amendment" strategy isn't implemented in the code. It has been commented out.

Was it commented out because the "Coefficients Amendment" strategy doesn't improve performance?

What is leaky-softmax?

Hello, I have read your paper, but I do not understand leaky-softmax.
Can you give me the equation? Thanks!
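
For reference, one common formulation of leaky softmax (an assumption here, based on the "leaky routing" idea from the original capsule network work, not on anything confirmed in this thread or in this repository's code) adds one extra logit fixed at zero before the softmax and then discards its share. In plain notation: c_j = exp(b_j) / (1 + sum_k exp(b_k)). The coupling coefficients can then sum to less than 1, which lets a child capsule route part of its mass to "nowhere". A minimal sketch:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def leaky_softmax(logits):
    """Softmax with one extra zero 'leak' logit whose probability is dropped.

    Assumed formulation, not taken from this repository:
    c_j = exp(b_j) / (1 + sum_k exp(b_k))
    """
    probs = softmax([0.0] + list(logits))
    return probs[1:]  # drop the leak dimension

print(softmax([0.0, 0.0]))        # sums to 1
print(leaky_softmax([0.0, 0.0]))  # sums to less than 1
```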

The dataset link is invalid!

Your dataset link is invalid. Can you fix it and provide a working link to the dataset?

hdf5 files for other datasets, such as MR

Could you please provide hdf5 files for other datasets? I find that generating the hdf5 file for the MR dataset needs a large amount of memory.

For a single label task, how do you handle the output of the model ?

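Hello, your code runs experiments on multi-label data, and every label whose output capsule vector has a norm greater than 0.5 is set to 1. This setting is clearly not suitable for single-label tasks. I would like to know how you set up the output for single-label tasks. I look forward to your answer, thanks!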
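
The difference the question points at can be sketched as follows. The 0.5 threshold comes from the question itself; picking the class with the largest activation for single-label data is a common convention and only an assumption here, not something the author has confirmed. Both function names are hypothetical.

```python
def predict_multi_label(activations, threshold=0.5):
    """Mark every class whose activation exceeds the threshold."""
    return [1 if a > threshold else 0 for a in activations]

def predict_single_label(activations):
    """Assumed convention: return the index of the largest activation."""
    return max(range(len(activations)), key=lambda i: activations[i])

acts = [0.7, 0.2, 0.6]
print(predict_multi_label(acts))   # several classes can be active
print(predict_single_label(acts))  # exactly one class index
```

Thresholding can return zero or several labels for one example, while the argmax variant always returns exactly one, even when no activation exceeds 0.5.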

Orphan Category

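On some tasks, running CapsNet directly performs somewhat worse than TextCNN. To account for background noise, you proposed three strategies: Orphan Category, Leaky-Softmax, and Coefficients Amendment. The code seems to contain only the Coefficients Amendment part. Will you upload the code for the other two methods?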

capsule-B F1 85.8?

In the paper, the capsule-B F1 score on the Reuters-Multilabel dataset is 85.8, but the best score I can get is 83.7.

python ./main.py -- model_type capsule-A --learning_rate 0.001

Epoch: 2 Val accuracy: 89.9% Loss: 0.0612
ER: 0.095 Precision: 0.635 Recall: 0.575 F1: 0.566
Epoch: 3 Val accuracy: 93.3% Loss: 0.0391
ER: 0.594 Precision: 0.912 Recall: 0.770 F1: 0.816
Epoch: 4 Val accuracy: 94.7% Loss: 0.0326
ER: 0.615 Precision: 0.939 Recall: 0.788 F1: 0.837
Epoch: 5 Val accuracy: 95.8% Loss: 0.0299
ER: 0.428 Precision: 0.948 Recall: 0.692 F1: 0.777
Epoch: 6 Val accuracy: 96.0% Loss: 0.0272
ER: 0.348 Precision: 0.958 Recall: 0.661 F1: 0.759

About CapsuleConv, FullyConnected layers.

Hello,
I read the paper with interest and would like to clarify the following.
According to the paper, vec_transformationByMat is supposed to be used in the layers, but the code applies vec_transformationByConv instead. It seems that vec_transformationByMat is never used in the code.
Thanks in advance.

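Why can't I get the results in your paper?

Why can't I get the results in your paper? Also, the final result never converges. Did you take the maximum value as the final result, or did you modify some hyperparameters? Could we discuss this? Thanks.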

py-2 to py-3, int(num_classes)

capsule_text_classification/network.py

Line 18 in f62ba4b

    
           activations = tf.sigmoid(slim.fully_connected(nets, num_classes, scope='final_layer', activation_fn=None))

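How should the activation function at line 101 of layer.py be understood?

Thank you for sharing. While studying the code, there is one place I don't understand: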
beta_a = _get_weights_wrapper(
name='beta_a', shape=[1, shape[-1]]
)
activations = K.sqrt(K.sum(K.square(poses), axis=-1)) + beta_a
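I understand that activations refers to the strength of the vector, but beta_a is a randomly initialized variable, so why is it added to activations?
I hope you can offer some guidance when you have time~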
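
To make the quoted expression concrete, here is a plain-Python sketch of what it computes: the Euclidean norm of each pose vector plus one offset per capsule. The interpretation that beta_a works like a bias term (randomly initialized, then adjusted during training) is an assumption, not something the author states in this thread. The function name and the list-based shapes are hypothetical.

```python
import math

def capsule_activations(poses, beta_a):
    """Norm of each pose vector plus a per-capsule offset.

    poses:  list of capsule pose vectors (one list of floats per capsule)
    beta_a: one offset per capsule; assumed to act as a trainable bias
    """
    return [
        math.sqrt(sum(x * x for x in pose)) + offset
        for pose, offset in zip(poses, beta_a)
    ]

print(capsule_activations([[3.0, 4.0], [0.0, 0.0]], [0.1, -0.2]))
```

Under this reading, the offset shifts each capsule's activation independently of the pose length, so a capsule with a zero pose still has a nonzero activation.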

About MR dataset

When I run your code on the MR dataset, I can't reproduce the results in your paper. Please tell me how you set up the experiment and how you processed the original data.

ValueError: num_outputs should be int or long, got 9.

Hello,
I need help.

Traceback (most recent call last):
  File "./main.py", line 167, in <module>
    poses, activations = baseline_model_cnn(X_embedding, args.num_classes)
  File "D:\PycharmProjects\pythonProject\capsule_text_classification-master\network.py", line 18, in baseline_model_cnn
    activations = tf.sigmoid(slim.fully_connected(nets, num_classes, scope='final_layer', activation_fn=None))
  File "D:\Anaconda\envs\py27\lib\site-packages\tensorflow\contrib\framework\python\ops\arg_scope.py", line 183, in func_with_args
    return func(*args, **current_args)
  File "D:\Anaconda\envs\py27\lib\site-packages\tensorflow\contrib\layers\python\layers\layers.py", line 1822, in fully_connected
    (num_outputs,))
ValueError: num_outputs should be int or long, got 9.
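
The message suggests that num_classes reaches slim.fully_connected as something that prints as 9 but is not a built-in integer. The trailing period appears to be sentence punctuation in the message, so the value could be, for example, a NumPy integer or a string read from the command line. The thread does not say which, so that is an assumption. A neighbouring issue title proposes int(num_classes). The hypothetical helper below sketches that cast with a check against silently truncating values such as 9.5.

```python
def as_class_count(num_classes):
    """Return num_classes as a built-in int, rejecting non-integral values.

    Hypothetical helper, not part of the repository.
    """
    as_int = int(num_classes)          # accepts 9, 9.0, "9", NumPy integers
    if float(num_classes) != as_int:   # guards against values like 9.5
        raise ValueError("num_classes must be a whole number, got %r" % (num_classes,))
    return as_int

print(as_class_count("9"), as_class_count(9.0))
```

The result could then be passed where the traceback shows num_classes being used, for instance as the second argument of slim.fully_connected in network.py.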

Datasets

Hello, could you share the other datasets?

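Questions about text classification

Hello, I ran into some problems when running your code (I did not change any of the model code). First, there is a dimension error in the routing part, at b=b+K.batch_dot(outputs,u_hat_vecs,[2,3]), where the coupling coefficients are computed. Here outputs has shape [224,16,16,16] and u_hat_vecs has shape [226,16,48,16]. The error is: Dimensions must be equal, but are 224 and 16 for 'capsule_3/conv2/add_2' (op: 'Add') with input shapes: [224,16,48], [224,16,16,16,48]. Second, dynamic routing in a capsule network is an iterative process, and you do not seem to stop gradients during the iterations. Is gradient stopping needed here?
I used TF matrix multiplication, [224,16,16,16] x [224,16,16,48], to get [224,16,16,48], then summed over the second dimension to get [224,16,48] and added it to b. That solved the problem in the routing part, but the later reshape of poses then ran into a related problem.
My environment is Python 3.6 and TF 1.14.0. Could this be an environment configuration issue? I would like to use your model as a baseline and cite it. Is there a typo in your code, or is this a Python 3.6 versus Python 2 problem?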
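
For comparison, the agreement update that dynamic routing usually performs can be written without any backend call. The sketch below assumes, from the shapes in the error message, that b is [batch, 16, 48], the parent outputs are [batch, 16, 16] and u_hat_vecs is [batch, 16, 48, 16]. Whether this matches the repository's intent is an assumption. One possible explanation for the 5-D result, not confirmed in this thread, is that K.batch_dot handles higher-rank inputs differently across Keras versions.

```python
def agreement_update(b, outputs, u_hat):
    """Return new logits with b[n][j][i] += dot(outputs[n][j], u_hat[n][j][i]).

    b:       [batch][J][I]     routing logits
    outputs: [batch][J][D]     parent capsule outputs
    u_hat:   [batch][J][I][D]  child-to-parent prediction vectors
    Equivalent to the einsum pattern 'njd,njid->nji' (assumed intent).
    """
    return [
        [
            [
                b[n][j][i] + sum(o * u for o, u in zip(outputs[n][j], u_hat[n][j][i]))
                for i in range(len(u_hat[n][j]))
            ]
            for j in range(len(u_hat[n]))
        ]
        for n in range(len(u_hat))
    ]

# Tiny example: batch=1, J=1 parent, I=2 children, D=2 dimensions.
b = [[[0.0, 1.0]]]
outputs = [[[1.0, 2.0]]]
u_hat = [[[[3.0, 4.0], [0.5, 0.5]]]]
print(agreement_update(b, outputs, u_hat))
```

The result keeps the shape of b, which is the property the failing Add op in the error message requires.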

Where to get the data

Where can I find the data?

PyTorch implementation

Hello, is there a PyTorch implementation of this code?