jancio / extracting-chemical-disease-associations-from-biomedical-literature

Analysis of associations between chemical and disease co-mentions in biomedical literature

Jupyter Notebook 99.61% Shell 0.39%

Extracting Chemical-Disease Associations From Biomedical Literature

In this project I analyse the strength of associations between chemical and disease co-mentions in the biomedical literature. First, I tune and train a named entity recognition (NER) model based on Conditional Random Fields to tag named entities (NEs) in text as chemicals or diseases. Next, I build a system that uses approximate string matching to ground NEs to Medical Subject Headings (MeSH) concepts. I then apply the NER model and the grounding system to a corpus of PubMed abstracts on chemically-induced disorders. Finally, I analyse the occurrences of chemical-disease pairs (CDPs) using various co-occurrence measures and investigate how similar the CDP rankings produced by these measures are. I also classify the CDPs according to the type of relation between the chemical and the disease, in consultation with the physician MUDr Maria Kleinova.
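
The grounding step can be illustrated with a minimal sketch, assuming a similarity threshold over character-level matches. This is not the project's implementation: the vocabulary, the concept IDs (`X001` and so on), the function name `ground`, and the `cutoff` value are all hypothetical, and the standard-library `difflib.SequenceMatcher` stands in for whatever approximate matching method the notebooks actually use.

```python
import difflib

# Hypothetical mini-vocabulary: placeholder concept IDs mapped to term strings.
# Real MeSH identifiers and terms would be loaded from the MeSH data files.
VOCABULARY = {
    "X001": "levodopa",
    "X002": "abnormal movements",
    "X003": "parkinson disease",
}


def ground(mention, vocabulary=VOCABULARY, cutoff=0.8):
    """Return (concept_id, term, score) for the best match, or None.

    The score is difflib's similarity ratio in [0, 1]; mentions whose best
    score falls below `cutoff` (an assumed threshold) are left ungrounded.
    """
    needle = mention.lower().strip()
    best = None
    for concept_id, term in vocabulary.items():
        score = difflib.SequenceMatcher(None, needle, term).ratio()
        if score >= cutoff and (best is None or score > best[2]):
            best = (concept_id, term, score)
    return best


if __name__ == "__main__":
    for mention in ["Levodopa", "levadopa", "aspirin"]:
        print(mention, "->", ground(mention))
```

A misspelt mention such as "levadopa" still resolves to the "levodopa" entry, while a mention with no sufficiently close term is returned as `None` rather than forced onto the nearest concept.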

My results show that the chemical "levodopa" and the disease "abnormal movements" are the most strongly co-occurring pair according to the Symmetric Conditional Probability (SCP) and Dice Coefficient (DC) measures, and that this CDP appears in the top 10 ranked CDPs for every co-occurrence measure. My investigation further indicates that the Simple Co-occurrence Count (SCC) is unlikely to be useful for discovering new chemical-disease associations, whereas the Normalised Point-wise Mutual Information (NPMI) is promising for this task. The ranking of CDPs by the SCC measure is also the most dissimilar to the rankings by the other measures. Regarding the type of relation between chemicals and diseases, the SCC measure seems best suited for identifying CDPs where the chemical induces the disease, while the NPMI measure seems best suited for extracting CDPs whose relations are not well known or possibly unknown.
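
For reference, the four measures named above can be computed from document counts as in the sketch below. It uses the commonly cited definitions (SCP as p(x,y)² / (p(x)·p(y)), Dice as 2·n(x,y) / (n(x) + n(y)), NPMI as PMI divided by −ln p(x,y)), which are assumed here and may differ in detail from the formulas used in the notebooks. The function name and the count arguments are hypothetical.

```python
import math


def cooccurrence_measures(n_xy, n_x, n_y, n_docs):
    """Compute SCC, SCP, DC and NPMI for one chemical-disease pair.

    n_xy   -- number of abstracts mentioning both the chemical and the disease
    n_x    -- number of abstracts mentioning the chemical
    n_y    -- number of abstracts mentioning the disease
    n_docs -- total number of abstracts in the corpus
    """
    if not (0 < n_xy <= min(n_x, n_y) and max(n_x, n_y) <= n_docs):
        raise ValueError("counts must satisfy 0 < n_xy <= n_x, n_y <= n_docs")

    p_xy = n_xy / n_docs
    p_x = n_x / n_docs
    p_y = n_y / n_docs

    scc = n_xy                          # Simple Co-occurrence Count
    scp = p_xy ** 2 / (p_x * p_y)       # Symmetric Conditional Probability
    dice = 2 * n_xy / (n_x + n_y)       # Dice Coefficient
    if p_xy == 1.0:
        npmi = 1.0                      # pair occurs in every abstract
    else:
        pmi = math.log(p_xy / (p_x * p_y))
        npmi = pmi / -math.log(p_xy)    # normalised to the range [-1, 1]

    return {"SCC": scc, "SCP": scp, "DC": dice, "NPMI": npmi}


if __name__ == "__main__":
    # Illustrative counts only, not results from the project.
    print(cooccurrence_measures(n_xy=10, n_x=20, n_y=10, n_docs=100))
```

Unlike SCC, the other three measures normalise by how often each entity occurs on its own, so a rare pair that almost always appears together can outrank a frequent pair of individually common terms.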
