This project forked from luozhhub/pymeshsim

a module to calculate the MeSH similarity

License: GNU General Public License v3.0

Introduction

More details can be found in the following reference.

Cite: Luo, Z., Shi, M., Yang, Z. et al. pyMeSHSim: an integrative python package for biomedical named entity recognition, normalization, and comparison of MeSH terms. BMC Bioinformatics 21, 252 (2020). https://doi.org/10.1186/s12859-020-03583-6

pyMeSHSim at a glance

Biomedical named entity (Bio-NE) recognition, normalization, and comparison

The recognition and normalization of bio-NE, especially for diseases, play an important role in clinical and biomedical research, such as clinical decision support, cohort identification, pharmacovigilance, and drug repositioning. For example, bio-NE recognition and normalization are prerequisites for semantic analysis, including semantic comparison of bio-NEs in drug repositioning. However, there are multiple synonyms, abbreviations and variations for bio-NEs, making it challenging to curate bio-NEs from free biomedical text or clinical narrative text.

MeSH

We extracted bio-NEs from free biomedical text and measured the semantic similarity between the bio-NEs based on the Medical Subject Headings (MeSH).

MeSH is a medical vocabulary resource curated by the National Library of Medicine (NLM). It provides a hierarchically organized terminology for indexing and cataloging biomedical information in MEDLINE/PubMed and other NLM databases. Moreover, MeSH is organized as a directed acyclic graph, which lays the foundation for computing the semantic similarity between two concepts.
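To make the graph idea concrete, here is a minimal sketch of a path-based similarity over a small directed acyclic graph. The hierarchy, the term labels, and the function names are all hypothetical and chosen for illustration; this is not real MeSH data and not pyMeSHSim's API. The similarity is taken as 1 / (1 + shortest path through a common ancestor), which is one simple convention among several.

```python
from collections import deque

# Hypothetical toy hierarchy (term -> list of parents); NOT real MeSH data.
# "lung neoplasm" has two parents, which makes this a DAG rather than a tree.
PARENTS = {
    "root": [],
    "disease": ["root"],
    "neoplasm": ["disease"],
    "lung disease": ["disease"],
    "lung neoplasm": ["neoplasm", "lung disease"],
    "breast neoplasm": ["neoplasm"],
}


def upward_distances(term):
    """Return {ancestor: fewest edges from term}, with term itself at 0."""
    dist = {term: 0}
    queue = deque([term])
    while queue:
        node = queue.popleft()
        for parent in PARENTS[node]:
            if parent not in dist:  # BFS: the first visit is the shortest
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def path_length(a, b):
    """Fewest edges from a to b via a shared ancestor (assumes a common root)."""
    da, db = upward_distances(a), upward_distances(b)
    common = set(da) & set(db)
    return min(da[c] + db[c] for c in common)


def path_similarity(a, b):
    """Map path length to (0, 1]; identical terms score 1.0."""
    return 1.0 / (1.0 + path_length(a, b))
```

In this toy graph, "lung neoplasm" and "breast neoplasm" meet at "neoplasm" after one edge each, so their path length is 2, while "lung disease" and "breast neoplasm" only meet at "disease" and are 3 edges apart.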

Although MeSH has potential for bio-NE recognition, normalization, and comparison, there is still a lack of MeSH tools that automatically recognize bio-NEs in free text and measure the semantic similarity between bio-NEs after normalization.

pyMeSHSim

Here, we developed an integrative, lightweight and data-rich Python package named pyMeSHSim to curate MeSH terms from free text and measure the semantic similarity between the MeSH terms.

Currently, pyMeSHSim consists of three subpackages:

  • data subpackage
    • The data subpackage stores the MeSH information, reorganized in bcolz format.
    • It is lightweight and data-rich.
    • It contains the main heading concepts with their unique DescriptorUI, MeSH tree codes, and corresponding UMLS IDs.
    • It contains all narrower concepts of the main heading concepts, and preserves the parent-child relationships and RN/RB relationships for all concepts.
  • metamapWrap subpackage
    • It provides filter rules for parsing free text.
    • It provides a unified interface for creating MeSH concept objects.
  • Sim subpackage
    • It provides APIs for querying the MeSH dataset.
    • It implements four semantic similarity measures based on information content.
    • It implements one semantic similarity measure based on path.
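As a rough illustration of what an information-content (IC) based measure computes, the sketch below implements the Resnik and Lin measures on a hypothetical toy hierarchy. These two measures are picked only because they are widely used; this section does not say which four measures the Sim subpackage implements, and none of the names below are pyMeSHSim's API. The IC here is an intrinsic one derived from the graph structure (the share of terms at or below a concept), which is an assumption made for the example.

```python
import math

# Hypothetical toy hierarchy (term -> list of parents); NOT real MeSH data.
PARENTS = {
    "root": [],
    "disease": ["root"],
    "neoplasm": ["disease"],
    "lung disease": ["disease"],
    "lung neoplasm": ["neoplasm", "lung disease"],
    "breast neoplasm": ["neoplasm"],
}


def ancestors(term):
    """All ancestors of term, including term itself."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """Intrinsic IC: log(total terms / terms at or below `term`)."""
    at_or_below = sum(1 for t in PARENTS if term in ancestors(t))
    return math.log(len(PARENTS) / at_or_below)


def resnik(a, b):
    """IC of the most informative common ancestor of a and b."""
    common = ancestors(a) & ancestors(b)
    return max(information_content(c) for c in common)


def lin(a, b):
    """Resnik score normalized by the ICs of both terms (0.0 if both are 0)."""
    denom = information_content(a) + information_content(b)
    return 2.0 * resnik(a, b) / denom if denom > 0 else 0.0
```

The root covers every term, so its IC is 0, while leaf terms carry the most information. Two terms that share only a very general ancestor therefore score low, regardless of how many edges separate them.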

Installation

Requirements

  • Software

    • MetaMap 2016v2
  • Python packages

    • python 3.6
    • pandas
    • bcolz>=1.2.1
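The Python side of these requirements can be checked with a short script. This is a minimal sketch using only the standard library; the helper names are hypothetical, and it only tests that the packages are importable, not that bcolz meets the 1.2.1 minimum.

```python
import sys
from importlib import util

# Package names taken from the requirements list above.
REQUIRED = ("pandas", "bcolz")


def python_ok(min_version=(3, 6)):
    """True if the running interpreter is at least min_version."""
    return sys.version_info >= min_version


def missing_packages(names):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if util.find_spec(name) is None]


print("Python version OK:", python_ok())
print("Missing packages:", missing_packages(REQUIRED))
```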

MetaMap installation

The metamapWrap subpackage is built on top of MetaMap.
You need an active UMLS Terminology Services (UTS) account to download MetaMap. Please see the MetaMap documentation for more information.

MetaMap depends on Java. To install OpenJDK:

$ sudo apt install default-jdk

After downloading and extracting MetaMap, install it with:

$ cd ./public_mm/
$ bash ./bin/install.sh

At this point, you have successfully installed MetaMap.

For convenience, add MetaMap's bin directory to the PATH environment variable in your ~/.bashrc:

$ export PATH=$PATH:/path_to_MetaMap/bin

Then launch the MetaMap servers before running pyMeSHSim:

$ skrmedpostctl start
$ wsdserverctl start
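If either command is not found, the PATH setting is the likely cause. The following sketch checks from Python whether the two control scripts named above can be located; the helper name is hypothetical, and the check says nothing about whether the servers are actually running.

```python
import shutil

# Control scripts named in the launch step above.
SERVER_SCRIPTS = ("skrmedpostctl", "wsdserverctl")


def find_on_path(commands):
    """Map each command to its resolved path, or None if it is not on PATH."""
    return {cmd: shutil.which(cmd) for cmd in commands}


for cmd, location in find_on_path(SERVER_SCRIPTS).items():
    print(cmd, "->", location or "not found on PATH")
```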

Installation of pandas and bcolz

To install the Python packages pandas and bcolz:

$ pip3 install pandas
$ pip3 install bcolz

Installation of pyMeSHSim

To install pyMeSHSim from source code:

$ git clone https://github.com/luozhhub/pyMeSHSim.git
$ cd pyMeSHSim
$ python3 ./setup.py install
