Wet Lab Protocols

This repository contains the Conditional Random Fields (CRF) model for the shared task "Entity and relation recognition over wet-lab protocols". The shared task was part of the 6th Workshop on Noisy User-generated Text (W-NUT), held in 2020.

Task: Named Entity Recognition

The task involves extracting 18 entity classes from wet lab protocols. The entity classes are described in "An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols" by Kulkarni et al. (2018). See the Annotation Guidelines section of the paper for a detailed description of the entity classes.

Example:

BRAT annotation

Sentence #6 shows the following entities:

Entity    Text
Action    Mix
Modifier  thoroughly
Action    pulse-spin
Device    microfuge

BRAT styled annotated protocols

http://kb1.cse.ohio-state.edu:8010/index.xhtml#/wnut_20_data/
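
BRAT stores annotations in a standoff format: each text-bound entity is one tab-separated line holding an ID, a type with character offsets, and the covered text. The following is a minimal sketch of reading such lines, not the repository's own loader. The sample sentence, its offsets, and the relation line are made up for illustration so that they match the entities listed in the example above; `Entity` and `parse_standoff` are hypothetical names.

```python
from typing import List, NamedTuple


class Entity(NamedTuple):
    ann_id: str   # e.g. "T1"
    label: str    # entity class, e.g. "Action"
    start: int    # character offset where the span begins
    end: int      # character offset just past the span
    text: str     # text covered by the span


def parse_standoff(ann_text: str) -> List[Entity]:
    """Collect text-bound annotations (lines starting with 'T')."""
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # relations, events, notes, etc. are skipped here
        ann_id, type_and_span, covered = line.split("\t", 2)
        # Discontinuous spans use ';' between fragments; this sketch
        # keeps only the first start and the last end offset.
        parts = type_and_span.replace(";", " ").split()
        entities.append(
            Entity(ann_id, parts[0], int(parts[1]), int(parts[-1]), covered)
        )
    return entities


# Illustrative data only: not taken from the shared-task corpus.
protocol_text = "Mix thoroughly and pulse-spin in microfuge."
ann_text = "\n".join([
    "T1\tAction 0 3\tMix",
    "T2\tModifier 4 14\tthoroughly",
    "T3\tAction 19 29\tpulse-spin",
    "T4\tDevice 33 42\tmicrofuge",
    "R1\tActs-on Arg1:T1 Arg2:T4",  # made-up relation line, ignored above
])

entities = parse_standoff(ann_text)
for ent in entities:
    print(ent.label, repr(protocol_text[ent.start:ent.end]))
```

Slicing the protocol text with the parsed offsets is a cheap consistency check: the slice should equal the covered text stored on the annotation line.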

Entity, Action and Relation extraction demo

Data

How to run?

Below are the example commands.

  • Load protocol and entity annotations (if available)

    • Segments protocol into sentences and words.
    • Executes spaCy's NLP pipeline over the protocol.

        python -m src.dataset --ann_format standoff --protocol_id 101
  • Execute Conditional Random Fields (CRF) model

    • Train CRF model

        python -m src.crf
      • Saves model in ./output/models directory
    • Validate development set

        python -m src.crf --train_model ./output/models/model_standoff.pkl --evaluate_collection
    • Predict on test set

        python -m src.crf --train_model ./output/models/model_standoff.pkl --predict_collection
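
A CRF tagger is usually trained on one feature dictionary and one BIO label per token. The sketch below shows how the steps above could fit together on a single sentence. It is an illustration under assumptions, not the code in `src.crf`: a regular expression stands in for the spaCy tokenization, the sentence and entity offsets are made up, and the feature set and the names `tokenize`, `bio_tags`, and `token_features` are hypothetical.

```python
import re

# Words (optionally hyphenated) or single punctuation marks.
TOKEN_RE = re.compile(r"\w+(?:-\w+)*|[^\w\s]")


def tokenize(text):
    """Return (token, start, end) triples with character offsets."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_RE.finditer(text)]


def bio_tags(tokens, entities):
    """Map (label, start, end) entity spans onto per-token BIO tags."""
    tags = ["O"] * len(tokens)
    for label, ent_start, ent_end in entities:
        inside = False
        for i, (_, tok_start, tok_end) in enumerate(tokens):
            if tok_start >= ent_start and tok_end <= ent_end:
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return tags


def token_features(tokens, i):
    """Build a feature dict for token i (hypothetical feature set)."""
    word = tokens[i][0]
    feats = {
        "bias": 1.0,
        "lower": word.lower(),
        "suffix3": word[-3:],
        "is_title": word.istitle(),
        "is_digit": word.isdigit(),
        "has_hyphen": "-" in word,
    }
    if i == 0:
        feats["BOS"] = True  # beginning of sentence
    else:
        feats["prev_lower"] = tokens[i - 1][0].lower()
    if i == len(tokens) - 1:
        feats["EOS"] = True  # end of sentence
    else:
        feats["next_lower"] = tokens[i + 1][0].lower()
    return feats


# Illustrative sentence and spans only: not taken from the corpus.
text = "Mix thoroughly and pulse-spin in microfuge."
spans = [("Action", 0, 3), ("Modifier", 4, 14),
         ("Action", 19, 29), ("Device", 33, 42)]

tokens = tokenize(text)
tags = bio_tags(tokens, spans)
features = [token_features(tokens, i) for i in range(len(tokens))]

for (word, _, _), tag in zip(tokens, tags):
    print(f"{word}\t{tag}")
```

Lists of such feature dictionaries, paired with the tag lists, are the kind of input that CRF toolkits such as sklearn-crfsuite accept. Which toolkit and which features this repository actually uses is not stated here.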

Publication

KaushikAcharya at WNUT 2020 Shared Task-1: Conditional Random Field(CRF) based Named Entity Recognition(NER) for Wet Lab Protocols (EMNLP | WNUT)

Citation

If you find this implementation helpful, please consider citing:

@inproceedings{acharya-2020-wnut,
    title = "{WNUT} 2020 Shared Task-1: Conditional Random Field({CRF}) based Named Entity Recognition({NER}) for Wet Lab Protocols",
    author = "Acharya, Kaushik",
    booktitle = "Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.wnut-1.37",
    pages = "286--289",
    abstract = "The paper describes how classifier model built using Conditional Random Field detects named entities in wet lab protocols.",
}

Related Work
