
ner_and_re_pipeline's Introduction

A Pipeline for Entity Recognition and Relation Extraction Written in PyTorch

Main Requirements

  • python 3.6
  • pytorch 1.0
  • bioc 1.0
  • nltk 3.3
  • numpy 1.15
  • pandas 0.24

Models

Entity Recognition: BiLSTM-CRF

Relation Extraction: BiLSTM-Attention

Usage

  1. Train the pipeline
python main.py -whattodo 1 -config default.config -output ./output -train_dir ./sample -dev_dir ./sample
  • whattodo=1: train ner and re models
  • config: configuration file
  • output: directory of saved models
  • train_dir: directory of training data
  • dev_dir: directory of development data
  2. Extract entities and relations using existing models
python main.py -whattodo 2 -config default.config -output ./output -input ./input -predict ./predict
  • whattodo=2: use existing models to extract entities and relations from raw text
  • config: configuration file
  • output: directory of saved models
  • input: directory of raw text
  • predict: directory of predicted results in the bioc-xml format
  3. Retrain the pipeline based on existing models
python main.py -whattodo 1 -config default.config -output ./output -pretrained_model_dir ./pretrained -train_dir ./sample -dev_dir ./sample
  • whattodo=1: train ner and re models
  • config: configuration file
  • output: directory of saved models
  • pretrained_model_dir: directory of pretrained models, i.e. the models saved by the training run in Usage 1
  • train_dir: directory of training data
  • dev_dir: directory of development data
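
The three commands above share one set of flags. The following is a minimal sketch of how such a command line could be parsed with Python's standard argparse module. It is not the project's actual main.py: the helper names build_parser and describe_mode are hypothetical, and the mapping from flags to "train", "retrain", and "predict" is an assumption based only on the usage descriptions above.

```python
import argparse


def build_parser():
    """Build a parser for the flags listed in the usage section (illustrative only)."""
    parser = argparse.ArgumentParser(description="NER and RE pipeline (sketch)")
    # 1 = train the NER and RE models, 2 = extract from raw text with existing models
    parser.add_argument("-whattodo", type=int, help="1: train, 2: extract")
    parser.add_argument("-config", help="configuration file")
    parser.add_argument("-output", help="directory of saved models")
    parser.add_argument("-train_dir", help="directory of training data")
    parser.add_argument("-dev_dir", help="directory of development data")
    parser.add_argument("-pretrained_model_dir", help="directory of pretrained models")
    parser.add_argument("-input", help="directory of raw text")
    parser.add_argument("-predict", help="directory of predicted results (BioC XML)")
    return parser


def describe_mode(args):
    """Name the usage scenario implied by the parsed flags (assumed mapping)."""
    if args.whattodo == 1:
        # Supplying pretrained models turns plain training into retraining.
        return "retrain" if args.pretrained_model_dir else "train"
    if args.whattodo == 2:
        return "predict"
    raise ValueError("unsupported -whattodo value: {}".format(args.whattodo))


# Parse the retraining command from Usage 3 as an example.
example = (
    "-whattodo 1 -config default.config -output ./output "
    "-pretrained_model_dir ./pretrained -train_dir ./sample -dev_dir ./sample"
).split()
print(describe_mode(build_parser().parse_args(example)))
```

Note that argparse accepts single-dash long options such as -whattodo, so the flag spelling used in the commands above works unchanged.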

Acknowledgement

If you find the code helpful, please cite:

@Article{info:doi/10.2196/12159,
author="Li, Fei and Liu, Weisong and Yu, Hong",
title="Extraction of Information Related to Adverse Drug Events from Electronic Health Record Notes: Design of an End-to-End Model Based on Deep Learning",
journal="JMIR Med Inform",
year="2018",
month="Nov",
day="26",
volume="6",
number="4",
pages="e12159",
issn="2291-9694",
doi="10.2196/12159",
url="http://medinform.jmir.org/2018/4/e12159/",
}

or

@article{li2017neural,
  title={A neural joint model for entity and relation extraction from biomedical text},
  author={Li, Fei and Zhang, Meishan and Fu, Guohong and Ji, Donghong},
  journal={BMC bioinformatics},
  volume={18},
  number={1},
  pages={198},
  year={2017},
  publisher={BioMed Central}
}

We mainly referred to the following work when writing the code, so please also cite it:

@inproceedings{yang2018ncrf,
 title={NCRF++: An Open-source Neural Sequence Labeling Toolkit},
 author={Yang, Jie and Zhang, Yue},
 booktitle={Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics},
 url={http://aclweb.org/anthology/P18-4013},
 year={2018}
}
@InProceedings{N18-1111,
  author = 	"Chen, Xilun and Cardie, Claire",
  title = 	"Multinomial Adversarial Networks for Multi-Domain Text Classification",
  booktitle = 	"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)",
  year = 	"2018",
  publisher = 	"Association for Computational Linguistics",
  pages = 	"1226--1240",
  location = 	"New Orleans, Louisiana",
  doi = 	"10.18653/v1/N18-1111",
  url = 	"http://aclweb.org/anthology/N18-1111"
}

ner_and_re_pipeline's Issues

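Bug question

I did not change the model, and the data is the example under sample. I ran into the problem shown in the attached screenshot and would appreciate some help.
Does the input file have to be a compressed file? That seems strange.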
