
T2NER: Transformers based Transfer Learning Framework for Named Entity Recognition (EACL 2021)

Home Page: https://www.aclweb.org/anthology/2021.eacl-demos.25.pdf

License: MIT License

Python 100.00%
transformers ner transfer-learning domain-adaptation multi-task-learning unsupervised-domain-adaptation bert eacl


T2NER

A transformers-based transfer learning framework for named entity recognition (NER).

Instructions

Clone the repository and install the dependencies listed in the requirements file:

git clone https://github.com/suamin/t2ner.git
cd t2ner
pip install -r requirements

Preprocessing

Download the NER data of interest and convert it into CoNLL format. Example datasets (GermEval 2014, CoNLL-2002) are provided in the data folder. Then preprocess the CoNLL-formatted data:

python t2ner/preprocess.py \
    --data_dir data/ner \
    --output_dir data/processed \
    --model_name_or_path bert-base-multilingual-cased \
    --model_type bert \
    --max_len 128 \
    --overwrite_output_dir \
    --languages es,nl
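
For readers converting their own data, the sketch below illustrates one common reading of "CoNLL format": one token per line with its tag in the last whitespace-separated column, and a blank line between sentences. The helper names `read_conll` and `write_conll` are hypothetical and are not part of T2NER; the exact column layout that `t2ner/preprocess.py` expects is an assumption here, so compare against the example files in the data folder before relying on it.

```python
from typing import Iterable, List, Tuple

# A sentence is a pair of parallel lists: tokens and their NER tags.
Sentence = Tuple[List[str], List[str]]


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Parse CoNLL-style lines into sentences (hypothetical helper).

    Assumption: the token is the first column and the NER tag is the
    last column; blank lines and -DOCSTART- lines end a sentence.
    """
    sentences: List[Sentence] = []
    tokens: List[str] = []
    tags: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("-DOCSTART-"):
            if tokens:
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        columns = line.split()
        tokens.append(columns[0])
        tags.append(columns[-1])
    if tokens:  # flush the last sentence if the input lacks a trailing blank line
        sentences.append((tokens, tags))
    return sentences


def write_conll(sentences: Iterable[Sentence]) -> str:
    """Serialize sentences as two-column 'token tag' text (hypothetical helper)."""
    blocks = [
        "\n".join(f"{tok} {tag}" for tok, tag in zip(toks, tags))
        for toks, tags in sentences
    ]
    return "\n\n".join(blocks) + "\n"


# Illustrative sample only, not taken from the shipped datasets.
sample = """\
Melbourne B-LOC
( O
Australia B-LOC
) O

El O
Abogado B-PER
"""

parsed = read_conll(sample.splitlines())
round_trip = read_conll(write_conll(parsed).splitlines())
```

Taking the tag from the last column means files that carry extra columns between token and tag (for example part-of-speech tags) are read the same way as two-column files.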

Experiments

To run an experiment:

python t2ner/run.py \
    --exp_type ner \
    --base_json configs/base.json \
    --exp_json configs/ner.json
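
The command takes a base JSON file and an experiment JSON file. One plausible way to combine two such files is to let experiment values override base values, sketched below. This is an assumption about the general pattern, not a description of what `t2ner/run.py` actually does; the function `merge_configs` and every key shown (`seed`, `model`, `max_len`, `languages`) are hypothetical and chosen only for illustration.

```python
import json
from typing import Any, Dict


def merge_configs(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where `override` values take precedence over `base`.

    Nested dicts are merged recursively; all other values are replaced.
    Hypothetical helper: T2NER's own config handling may differ.
    """
    merged = dict(base)  # shallow copy so the caller's base dict is not mutated
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical contents standing in for configs/base.json and configs/ner.json.
base = json.loads(
    '{"seed": 42, "model": {"name": "bert-base-multilingual-cased", "max_len": 128}}'
)
exp = json.loads('{"model": {"max_len": 256}, "languages": ["es", "nl"]}')

config = merge_configs(base, exp)
```

With this pattern, shared defaults live in one file and each experiment file lists only the settings it changes.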

Citation

If you find our framework useful, please consider citing:

@inproceedings{amin-neumann-2021-t2ner,
    title = "{T}2{NER}: Transformers based Transfer Learning Framework for Named Entity Recognition",
    author = "Amin, Saadullah and Neumann, G{\"u}nter",
    booktitle = "Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations",
    month = apr,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.eacl-demos.25",
    doi = "10.18653/v1/2021.eacl-demos.25",
    pages = "212--220"
}

Also, see our follow-up work, which uses T2NER for few-shot cross-lingual de-identification of clinical texts:

@inproceedings{amin-etal-2022-shot,
    title = "Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts",
    author = "Amin, Saadullah and Pokaratsiri Goldstein, Noon and Kelly Wixted, Morgan and Garcia-Rudolph, Alejandro and Mart{\'\i}nez-Costa, Catalina and Neumann, G{\"u}nter",
    booktitle = "Proceedings of the 21st Workshop on Biomedical Language Processing",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.bionlp-1.20",
    doi = "10.18653/v1/2022.bionlp-1.20",
    pages = "200--211"
}

Acknowledgements

The algorithmic components of the framework largely follow Transfer-Learning-Library and Dassl.pytorch. If you find T2NER useful, please also consider citing these works.
