The named_entity_recognition_healthcare_data from a-yush

a-yush / named_entity_recognition_healthcare_data Goto Github PK

View Code? Open in Web Editor NEW

Named Entity Recognition in Healthcare data to identify possible diseases and their suggested treatments from a corpus of medical text containing both disease and treatment.

License: MIT License

Jupyter Notebook 100.00%

named_entity_recognition_healthcare_data's Introduction

PROBLEM STATEMENT

"BeHealthy" is a health-tech company and has a web platform that allows doctors to list their services and manage patient interactions and provides services for patients such as booking interactions with doctors and ordering medicines online. Here, doctors can easily organise appointments, track past medical records and provide e-prescriptions. So, companies like "BeHealthy" are providing medical services, prescriptions and online consultations and generating huge data day by day.

The following snippet of medical data may be generated when a doctor is writing notes to his/her patient or as a review of a therapy that he or she has done. "The patient was a 62-year-old man with squamous cell lung cancer, which was first successfully treated by a combination of radiation therapy and chemotherapy."

As we can see in this text, a person with a non-medical background cannot understand the various medical terms. We have taken a simple sentence from a medical data set to understand the problem and where one can understand the terms 'cancer' and 'chemotherapy'.

We have been given such a data set in which a lot of text is written related to the medical domain. There are a lot of diseases that can be mentioned in the entire dataset and their related treatments are also mentioned implicitly in the text, which we saw in the aforementioned example that the disease mentioned is cancer and its treatment can be identified as chemotherapy using the sentence. However, note that it is not explicitly mentioned in the dataset about the diseases and their treatment, but we can build an algorithm to map the diseases and their respective treatment.

The train dataset is used to train the Continuous Random Field (CRF) model, and the test dataset is used to evaluate the built model.

Recommend Projects

a-yush / named_entity_recognition_healthcare_data Goto Github PK

named_entity_recognition_healthcare_data's Introduction

named_entity_recognition_healthcare_data's People

Contributors

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent