thunlp / attribute_charge

The source code of our COLING'18 paper "Few-Shot Charge Prediction with Discriminative Legal Attributes".

Python 100.00%

legal-ai

attribute_charge's Introduction

Few-Shot Charge Prediction with Discriminative Legal Attributes

Source code and datasets for the COLING 2018 paper "Few-Shot Charge Prediction with Discriminative Legal Attributes". (pdf)

Dataset

Please download the dataset here and unzip it; you will get three folders: "data", "data_20w", and "data_38w". Then put the "data" folder under this directory. It contains the following files:

words.vec: Pre-trained word embeddings; each line contains a word and its embedding.
attributes: The legal attributes for each charge.
train: Training data from the small dataset.
test: Test data from the small dataset.
valid: Validation data from the small dataset.
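
As a rough illustration of the words.vec layout described above (a word followed by its embedding on each line), the file could be read along these lines. This is a minimal sketch, not the repository's own loader: the function name `load_word_vectors`, the UTF-8 encoding, the whitespace separator, and the absence of a header line are all assumptions.

```python
import io


def load_word_vectors(path, encoding="utf-8"):
    """Return a dict mapping each word to its embedding (a list of floats).

    Assumes one entry per line: the word, then whitespace-separated numbers.
    The encoding and separator are assumptions, not confirmed by the README.
    """
    vectors = {}
    with io.open(path, encoding=encoding) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank lines and lines with a word but no numbers
            vectors[fields[0]] = [float(value) for value in fields[1:]]
    return vectors
```

Calling `load_word_vectors("data/words.vec")` would then give a plain dictionary that can be inspected before training.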

To train and test on the medium dataset, copy the files in the "data_20w" folder to the "data" folder. To train and test on the large dataset, copy the files in the "data_38w" folder to the "data" folder.
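
The copy step above can also be scripted. The following is a small sketch under the assumption that the unzipped folders sit next to each other; the helper name `copy_dataset_files` is hypothetical and not part of the repository. Files with the same name in the target folder are overwritten, and any other files already there are left alone.

```python
import os
import shutil


def copy_dataset_files(source_dir, target_dir="data"):
    """Copy every regular file from source_dir into target_dir.

    Existing files with the same name are overwritten; subdirectories
    of source_dir are ignored. Returns the sorted list of copied names.
    """
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    copied = []
    for name in sorted(os.listdir(source_dir)):
        source_path = os.path.join(source_dir, name)
        if os.path.isfile(source_path):
            shutil.copy(source_path, os.path.join(target_dir, name))
            copied.append(name)
    return copied
```

For example, `copy_dataset_files("data_20w")` would switch the "data" folder to the medium dataset.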

Run

Run the following commands to train our model:

cd code/
python train.py

Dependencies

TensorFlow == 0.12
SciPy == 0.18.1
NumPy == 1.11.2
Python == 2.7

Log

After training starts, a new folder "log" is created. It contains four directories:

/evaluation_charge_log/: stores the model's charge prediction performance on the test data during training.
/evaluation_attr_log/: stores the model's attribute prediction performance on the test data during training.
/validation_charge_log/: stores the model's charge prediction performance on the validation data during training.
/validation_attr_log/: stores the model's attribute prediction performance on the validation data during training.

Cite

If you use the code, please cite this paper:

Zikun Hu, Xiang Li, Cunchao Tu, Zhiyuan Liu, Maosong Sun. Few-Shot Charge Prediction with Discriminative Legal Attributes. The 27th International Conference on Computational Linguistics (COLING 2018).

For more related work, please refer to my homepage.


attribute_charge's Issues

Some questions about the code and reproducing the results

Code question 1
In model.py, unmasked_attr_loss on line 125 has shape [batch_size,], and attr_mask on line 126 has shape [batch_size, 1]. After tf.multiply, the resulting attr_loss would have shape [batch_size, batch_size]. Shouldn't attr_mask be reshaped to [batch_size,] first?
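
The shape behavior described above can be checked outside the model. tf.multiply broadcasts with NumPy-style rules, so this sketch uses NumPy arrays as stand-ins for the two tensors. The variable names follow the issue, but the values and the batch size are made up for illustration, and whether the tensors in model.py really have these shapes is taken from the issue, not verified here.

```python
import numpy as np

batch_size = 4  # hypothetical batch size for illustration

# Stand-ins for the tensors named in the issue (values are invented).
unmasked_attr_loss = np.ones(batch_size)             # shape (4,)
attr_mask = np.array([[1.0], [0.0], [1.0], [1.0]])   # shape (4, 1)

# (4,) times (4, 1) broadcasts to (4, 4): every loss is paired with every mask.
broadcast = unmasked_attr_loss * attr_mask

# Flattening the mask first keeps the product elementwise, shape (4,).
fixed = unmasked_attr_loss * attr_mask.reshape(-1)

print(broadcast.shape, fixed.shape)
```

If the mask is meant to zero out individual examples, the flattened version is the one that does so; the broadcast version would inflate a summed loss by roughly a factor of the batch size.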
Code question 2
In model.py, should line 71 use tf.reduce_mean instead of tf.reduce_sum?
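
For context on why the choice between the two reductions matters, here is a generic NumPy sketch. It does not reproduce line 71 of model.py, whose content is not shown here; the array `per_example_loss` and its values are hypothetical. The point is only that a summed quantity grows with the number of elements while a mean does not.

```python
import numpy as np

# Hypothetical per-example values; not taken from the repository.
per_example_loss = np.array([0.5, 1.5, 1.0, 1.0])

summed = per_example_loss.sum()     # grows with the number of elements
averaged = per_example_loss.mean()  # independent of the number of elements

# Doubling the batch doubles the sum but leaves the mean unchanged.
doubled = np.concatenate([per_example_loss, per_example_loss])
print(summed, doubled.sum())
print(averaged, doubled.mean())
```

Whether a sum or a mean is appropriate depends on what is being reduced; a sum over attention weights or over a feature axis, for instance, is a different matter from a sum over the batch.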
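Reproducing the results
These are the results I reproduced: Acc: 93.4; MP: 56.9; MR: 57.7; F1: 55.6. The F1 score falls short of the 64.9 reported in the paper. What might be causing this? Thanks!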

Bug when initializing words.vec

./attribute_charge/code/init.py", line 52, in
content = [(float)(i) for i in content]
ValueError: could not convert string to float: '炜'
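
One way to narrow down an error like the one above is to scan the embedding file for lines whose fields after the first do not parse as floats. The sketch below assumes whitespace-separated fields and uses a hypothetical helper name, `find_unparseable_lines`. A word that itself contains whitespace, or a file read with a different encoding or Python version than intended, are possible causes, but these are guesses and not confirmed by the report.

```python
def find_unparseable_lines(lines):
    """Return (line_number, token) for the first non-numeric token on each bad line.

    The first field of every line is treated as the word and is not checked.
    """
    bad = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        for token in fields[1:]:
            try:
                float(token)
            except ValueError:
                bad.append((line_number, token))
                break  # one report per line is enough
    return bad
```

With a file on disk, this could be used as `find_unparseable_lines(io.open("data/words.vec", encoding="utf-8"))` after `import io`; the path and encoding are assumptions.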

Can this code still be run today?

An error occurs at runtime; what is going on?

Traceback (most recent call last):
  File "train.py", line 368, in <module>
    tf.app.run()
  File "/home/lx/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "train.py", line 83, in main
    lstm_model = model.LSTM_MODEL(word_embeddings=word_embeddings,attr_table=attr_table,config = lstm_config)
  File "/home/lx/attribute_charge-master/code/model.py", line 50, in __init__
    self.attn_weights = tf.concat(1, [tf.expand_dims(temp,1) for temp in self.attention_weights])
  File "/home/lx/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1122, in concat
    tensor_shape.scalar())
  File "/home/lx/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/tensor_shape.py", line 848, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (10, 32, 1, 500) and () are incompatible
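
A likely explanation for this error, offered as an assumption since the installed version is not stated in the report, is a TensorFlow version mismatch. The call in model.py uses the old argument order tf.concat(concat_dim, values), while TensorFlow 1.0 and later expect tf.concat(values, axis). With the newer order, the list of tensors lands in the axis slot, which would produce a shape complaint of this kind. The helper below, `concat_compat`, is hypothetical and only sketches how a call could be made to work on both sides of that change.

```python
def concat_compat(tf_module, axis, values):
    """Concatenate `values` along `axis` with either tf.concat argument order.

    TensorFlow < 1.0 expects tf.concat(concat_dim, values);
    TensorFlow >= 1.0 expects tf.concat(values, axis).
    `tf_module` is the imported tensorflow module.
    """
    major_version = int(tf_module.__version__.split(".")[0])
    if major_version >= 1:
        return tf_module.concat(values, axis)
    return tf_module.concat(axis, values)
```

In model.py this would be called as `concat_compat(tf, 1, [tf.expand_dims(temp, 1) for temp in self.attention_weights])`. Installing the TensorFlow version listed under Dependencies is the other route, and other calls in the code may be affected by the same API change.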