minghao-wu / crf-ae

Code for EMNLP 2018 paper https://arxiv.org/pdf/1808.09075.pdf

License: MIT License

Python 77.37% Perl 22.63%

crf-ae's Introduction

Neural-CRF-AE

Requirements

PyTorch 0.3.0
spaCy 2.0.0
Python 3.6

External Resources

Features are available at Google Drive

Gazetteers are available at Google Drive

Instructions

Clone this repo.
Create three new folders: models, features and checkpoints.
Download the pre-trained word embeddings to models and the feature files to features.
Run python main.py; the model will be saved to checkpoints.
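
The folder setup in the steps above can be sketched in Python. This is a minimal sketch, assuming it is run from the root of the cloned repo; the helper name make_project_dirs is hypothetical and not part of the repository.

```python
from pathlib import Path

# Folder names taken from the instructions above.
REQUIRED_DIRS = ("models", "features", "checkpoints")


def make_project_dirs(root="."):
    """Create the three folders under `root` and return their paths.

    Folders that already exist are left untouched.
    """
    created = []
    for name in REQUIRED_DIRS:
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling make_project_dirs() before downloading the embeddings and feature files gives python main.py the layout it expects; the downloads themselves still have to be done by hand.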

Acknowledgement

Some programs are adapted from:

Thank you for your contributions.

crf-ae's Issues

question about the reconstruction mechanism

Hi~
So you just reconstruct the POS tag at the same position, i.e. P(p_i | enc[w_i, p_i]), instead of predicting the POS tag at the next position, i.e. P(p_{i+1} | enc[w_i, p_i])?
I am wondering why this kind of mechanism helps. At the same position the neural network already knows the answer, right? So how can this improve performance?

question about the ablation study on POS

Hi~

I have two questions about the POS feature.

Have you tested using only POS as an additional feature? (Table 3 shows the performance after removing POS, not after adding it.) If so, could you please share the numbers?
What do you mean by "Both features are based on the implementation of spaCy" in Section 2.2? Did you re-annotate the POS tags with spaCy instead of using the POS annotations from CoNLL-2003?

IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

False
OrderedDict([('tag_scheme', 'iobes'), ('lower', True), ('zeros', False), ('char_dim', 25), ('char_lstm_dim', 25), ('char_bidirect', True), ('word_dim', 300), ('word_lstm_dim', 300), ('word_bidirect', True), ('all_emb', True), ('cap_dim', 0), ('crf', True), ('dropout', 0.5), ('reload', False), ('name', 'test'), ('char_mode', 'CNN'), ('train', 'dataset/eng.train'), ('dev', 'dataset/eng.testa'), ('test', 'dataset/eng.testb'), ('test_train', 'dataset/eng.train54019'), ('pre_emb', 'models/glove.6B.300d.txt'), ('use_gpu', False), ('features_dim', 196), ('feature_train', 'features/all_onehot.train'), ('feature_dev', 'features/all_onehot.testa'), ('feature_test', 'features/all_onehot.testb'), ('gazetter_dim', 3), ('gazetteer_train', 'features/gazetteer_PERLOC.train'), ('gazetteer_dev', 'features/gazetteer_PERLOC.testa'), ('gazetteer_test', 'features/gazetteer_PERLOC.testb'), ('pos_lambda', 1), ('wordshape_lambda', 1), ('gazetteer_lambda', 1), ('learning_rate', 0.015), ('epochs', 50)])
eval_script = ./evaluation/conlleval.pl
eval_temp = ./evaluation/temp
Found 7518 unique words (203621 in total)
Loading pretrained embeddings from models/glove.6B.300d.txt...
Found 85 unique characters
Found 19 unique named entity tags
14041 / 3250 / 3453 sentences in train / dev / test.
Loaded 400000 pretrained embeddings.
word_to_id: 400176
/Users/nofelmahmood/Documents/Master Thesis Topics/CRF-AE-master/utils.py:220: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(input_linear.weight, -bias, bias)
/Users/nofelmahmood/Documents/Master Thesis Topics/CRF-AE-master/utils.py:213: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(input_embedding, -bias, bias)
/Users/nofelmahmood/Documents/Master Thesis Topics/CRF-AE-master/utils.py:239: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(weight, -bias, bias)
/Users/nofelmahmood/Documents/Master Thesis Topics/CRF-AE-master/utils.py:242: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(weight, -bias, bias)
/Users/nofelmahmood/Documents/Master Thesis Topics/CRF-AE-master/utils.py:247: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(weight, -bias, bias)
/Users/nofelmahmood/Documents/Master Thesis Topics/CRF-AE-master/utils.py:250: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
nn.init.uniform(weight, -bias, bias)

Epoch 1 / 50
0%| | 0/14041 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 319, in <module>
    train_model(model, data, optimizer, step_lr_scheduler, num_epochs=parameters["epochs"])
  File "main.py", line 286, in train_model
    loss += neg_log_likelihood.data[0] / len(data['words'])
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
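
The traceback points at neg_log_likelihood.data[0]. That indexing matches the PyTorch 0.3.0 listed in the requirements, while the error message itself says newer releases expect tensor.item() for a 0-dim tensor. Below is a minimal sketch of a workaround that tolerates both, assuming the loss is a scalar; the helper name scalar_value is hypothetical and not part of the repository.

```python
def scalar_value(loss):
    """Return a plain Python number from a scalar loss.

    Prefers .item(), which the error message above recommends for
    0-dim tensors, and falls back to the legacy .data[0] indexing
    for objects that do not provide .item().
    """
    if hasattr(loss, "item"):
        return loss.item()
    return loss.data[0]
```

With such a helper, line 286 of main.py could read loss += scalar_value(neg_log_likelihood) / len(data['words']). If only a recent PyTorch needs to be supported, calling neg_log_likelihood.item() directly is simpler.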

Where is mapping_file(models/mapping.pkl) used?

I am trying to understand how you map the glove word embeddings and create the vocabulary. In this regard, I have a question. Where is this mapping.pkl pickle dump used further in the codebase? I see no mention of it any further after you dump it to a file.

Adding more features to the auto encoder

Hi,

I see there are only 3 categories of features used by the autoencoder.
Is it possible to add more categories of features, and how could I use both categorical and non-categorical ones?

Thanks
