
patelrajnath / rnn4nlp


This repository contains an RNN-based word-level quality estimation system and a Part-of-Speech tagger.

Home Page: http://kbcs.in/

License: GNU General Public License v3.0

Python 98.46% Shell 1.54%
quality-estimation pos-tagger deep-learning natural-language-processing recurrent-neural-networks long-short-term-memory-models

rnn4nlp's Introduction

This project is no longer maintained. A PyTorch-based implementation of this repository is available at https://github.com/patelrajnath/dl4nlp-py

Recurrent Neural Networks for Natural Language Processing (rnn4nlp)

This repository contains:

(1) An RNN-based system for word-level quality estimation.

(2) An RNN-based Part-of-Speech tagger for code-mixed social media text.

It includes implementations of several RNN models: simple Recurrent Neural Network, Long Short-Term Memory (LSTM), DeepLSTM, and Gated Recurrent Units (GRU), also known as Gated Hidden Units (GHU). The system is flexible enough to be used for any word-level NLP tagging task, such as Named Entity Recognition.
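To make the GRU variant mentioned above more concrete, here is a minimal sketch of a single GRU time step in plain Python. It is not taken from this repository: the parameter names (`W_z`, `U_z`, `b_z`, and so on) are hypothetical, and the update rule shown is one common convention, h' = (1 - z) * h + z * h_candidate. Other formulations swap the roles of z and (1 - z).

```python
import math


def sigmoid(x):
    """Logistic function used for the GRU gates."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gru_step(x, h, p):
    """Compute one GRU step.

    x: input vector, h: previous hidden state,
    p: dict of weights with hypothetical names W_*, U_*, b_*.
    """
    # Update gate: how much of the candidate state to let in.
    z = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["W_z"], x), matvec(p["U_z"], h), p["b_z"])]
    # Reset gate: how much of the previous state feeds the candidate.
    r = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["W_r"], x), matvec(p["U_r"], h), p["b_r"])]
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    # Candidate hidden state.
    h_cand = [math.tanh(a + b + c)
              for a, b, c in zip(matvec(p["W_h"], x), matvec(p["U_h"], reset_h), p["b_h"])]
    # Interpolate between the previous state and the candidate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_cand)]
```

In a tagger, a step like this would be applied to each word embedding in turn, and the hidden state at each position would feed a classifier over the label set.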

Pre-requisites

Quick Start

Quality estimation with toy data:

Create the vocab for training-

WORD INPUT
$python utils/build_dictionary.py data/qe/train/train.src.lc 0
$python utils/build_dictionary.py data/qe/train/train.mt.lc 0

CHARACTER INPUT (--use_char switch)
$python utils/build_char_dictionary.py data/qe/train/train.src.lc
$python utils/build_char_dictionary.py data/qe/train/train.mt.lc


LABELS
$python utils/build_dictionary.py data/qe/train/train.tags 1

Note: The --use_char switch is available only with the GRU model.
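For readers unfamiliar with this step, the sketch below shows what building a word dictionary typically involves: counting tokens and assigning each one an integer index. It is an illustration only and not the code in utils/build_dictionary.py. The function name, the `<unk>` entry, and the frequency ordering are assumptions.

```python
from collections import Counter


def build_word_dictionary(lines, specials=("<unk>",)):
    """Map each whitespace-separated token to an integer index.

    Hypothetical helper: special tokens come first, then words ordered
    by descending frequency (ties broken alphabetically).
    """
    counts = Counter(tok for line in lines for tok in line.split())
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    vocab = {tok: i for i, tok in enumerate(specials)}
    for word in ordered:
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


if __name__ == "__main__":
    print(build_word_dictionary(["the cat", "the dog"]))
```

The same idea applies to the label file: each distinct tag gets an index, which the model uses as its output class.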

And then run the training script-

$bash train-qe.sh

Testing with new test-set-

$bash test-qe.sh

Note: You can specify any text for testing, but the dictionaries and label2index must be the same ones used at training time.
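The reason for this constraint is that the trained model only understands the integer indices it saw during training. The sketch below illustrates the idea under assumed names: `encode`, `decode_labels`, the `<unk>` token, and the `OK`/`BAD` labels are hypothetical and not taken from this repository.

```python
def encode(tokens, word2idx, unk="<unk>"):
    """Convert tokens to indices with the training-time dictionary.

    Words not seen in training fall back to the unknown-word index.
    """
    unk_id = word2idx[unk]
    return [word2idx.get(tok, unk_id) for tok in tokens]


def decode_labels(ids, label2idx):
    """Convert predicted label indices back to label strings."""
    idx2label = {idx: label for label, idx in label2idx.items()}
    return [idx2label[i] for i in ids]


if __name__ == "__main__":
    word2idx = {"<unk>": 0, "the": 1, "cat": 2}   # built at training time
    label2idx = {"OK": 0, "BAD": 1}               # built at training time
    print(encode("the zebra".split(), word2idx))
    print(decode_labels([1, 0], label2idx))
```

If a different dictionary were built from the test set, the same word would map to a different index and the model's predictions would be meaningless.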

Part-of-Speech tagging with toy data:

Create vocab for training-

WORD INPUT
$python utils/build_dictionary.py data/pos/hi-en.train.txt 0

CHARACTER INPUT (--use_char switch)
$python utils/build_char_dictionary.py data/pos/hi-en.train.txt

LABELS
$python utils/build_dictionary.py data/pos/hi-en.train.tags 1
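Character input can help with code-mixed text, where spelling varies and many words are missing from the word dictionary. As a rough illustration (not the code in utils/build_char_dictionary.py), the sketch below builds a character dictionary and turns a word into a fixed-length list of character indices. The function names, the `<pad>` and `<unk>` entries, and the padding scheme are assumptions.

```python
def build_char_dictionary(lines, specials=("<pad>", "<unk>")):
    """Map every character seen in the tokens to an integer index."""
    chars = sorted({ch for line in lines for tok in line.split() for ch in tok})
    vocab = {tok: i for i, tok in enumerate(specials)}
    for ch in chars:
        if ch not in vocab:
            vocab[ch] = len(vocab)
    return vocab


def word_to_char_ids(word, char2idx, max_len, pad="<pad>", unk="<unk>"):
    """Encode a word as exactly max_len character indices.

    Longer words are truncated; shorter ones are right-padded.
    """
    ids = [char2idx.get(ch, char2idx[unk]) for ch in word[:max_len]]
    return ids + [char2idx[pad]] * (max_len - len(ids))


if __name__ == "__main__":
    char2idx = build_char_dictionary(["kya hai"])
    print(word_to_char_ids("hai", char2idx, 5))
```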

And then run the training script-

$bash train-tag.sh

Testing with new test-set-

$bash test-tag.sh

Note: You can specify any text for testing, but the dictionaries and label2index must be the same ones used at training time.

Detailed Description

For a detailed description, visit the wiki page: https://github.com/patelrajnath/rnn4nlp/wiki

Publications:

If you use this project, please cite the following papers:

@InProceedings{patel-m:2016:WMT,
  author    = {Patel, Raj Nath and M, Sasikumar},
  title     = {Translation Quality Estimation using Recurrent Neural Network},
  booktitle = {Proceedings of the First Conference on Machine Translation},
  month     = {August},
  year      = {2016},
  address   = {Berlin, Germany},
  publisher = {Association for Computational Linguistics},
  pages     = {819--824},
  url       = {http://www.statmt.org/wmt16/pdf/W16-2389.pdf}
}

@article{patel2016recurrent,
  title   = {Recurrent Neural Network based Part-of-Speech Tagger for Code-Mixed Social Media Text},
  author  = {Patel, Raj Nath and Pimpale, Prakash B and Sasikumar, M},
  journal = {arXiv preprint arXiv:1611.04989},
  year    = {2016},
  url     = {https://arxiv.org/pdf/1611.04989.pdf}
}

Author

Raj Nath Patel

LinkedIn: https://www.linkedin.com/in/raj-nath-patel-2262b024/

Version

0.1

LICENSE

Copyright Raj Nath Patel 2017 - present

rnn4nlp is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

You should have received a copy of the GNU General Public License along with rnn4nlp. If not, see http://www.gnu.org/licenses/.
