healthlearning's Introduction

Health Learning: ML and Deep Learning for Healthcare 🩺🧠

Health Learning is an open-source project aimed at leveraging machine learning (ML) and deep learning techniques to address various healthcare challenges. By harnessing the power of data-driven approaches, our goal is to develop predictive models, diagnostic tools, and decision support systems to improve patient outcomes, optimize healthcare delivery, and advance medical research.

Motivation 🚀

The field of healthcare is ripe for innovation, with vast amounts of data available from diverse sources such as electronic health records, medical imaging, wearable devices, and genetic sequencing. Health Learning seeks to harness this wealth of data to tackle a wide range of healthcare issues, including disease prediction, diagnosis, treatment optimization, and personalized medicine. By democratizing access to healthcare data and cutting-edge machine learning algorithms, we aim to empower researchers, clinicians, and healthcare professionals to make data-driven decisions and drive innovation in healthcare.

Datasets 📊

Health Learning provides access to a curated collection of healthcare datasets drawn from various sources, including public repositories like Kaggle. These datasets cover a broad spectrum of health-related topics, including maternal health, diabetes classification, cardiovascular disease risk factors, stroke prediction, cancer imaging, and more. Researchers and developers can explore these datasets to develop and validate machine learning models for a wide range of healthcare applications. Each project lists its datasets in its own README.md file.

Contributing 🤝

Health Learning welcomes contributions from researchers, developers, healthcare professionals, and enthusiasts passionate about using machine learning and deep learning for healthcare. Whether you're interested in developing new models, improving existing algorithms, or curating datasets, there are plenty of opportunities to get involved. Check out our Contribution Guidelines to learn how you can contribute to the project.


Note: Health Learning is a community-driven initiative and is not affiliated with any specific healthcare organization or institution. We strive to promote collaboration, transparency, and open exchange of knowledge for the betterment of healthcare worldwide. Join us in our mission to revolutionize healthcare through machine learning and deep learning! 🌍💡

healthlearning's People

Contributors

aditi1807, anuragsarkar12, arihant-bhandari, asymtode712, avii-07, basma2423, dakshsinghrathore, deedghost, dharanilakkireddy, disha-16, divyanshi1002, jain-anshika, mehekfatima, officeneerajsaini, piyushseth55, pradnyagaitonde, rutikaw1155, saksh8, sanjanabankar, sanketv010, srijanshovit, suhanipaliwal, theiturhs, vedanshipathak, yusuf-khaan

healthlearning's Issues

Metabolic Syndrome Prediction | 1. Dataset Exploration

Hi, I'm one of the contributors under GSSoC'24 and would like to work on this problem statement. I would perform basic exploratory data analysis as well as modelling: I would implement traditional as well as gradient boosting algorithms, submit a report on the metrics, and comment and segment the sections of the work file.

I hope I can be assigned this.

Stroke Prediction

I would like to use a few classification models, such as decision tree, XGBoost, and random forest, for stroke detection. Please assign me the issue. I'm a GSSoC'24 contributor.

Maternal Health Risk prediction

Hi, I am Disha Mukhopadhyay, currently pursuing a BTech in Computer Science and Engineering (CSE). I have done an internship as an AI intern, published an IEEE paper in the machine learning domain, and also have some papers in proceedings in the machine learning and data science domains. It would be very helpful if you could assign me this project for GSSoC'24 as a contributor, so that I can work on it.
Approach for this project:
1. Data collection and preprocessing: Collect the dataset and preprocess it.
2. Exploratory Data Analysis (EDA): Perform visual inspection, a statistical summary, and an analysis of the data distribution.
3. Model Selection and Deployment: Choose and implement four types of machine learning models (XGBoost, logistic regression, random forest, Gaussian naive Bayes).
4. Model training and evaluation: Train each model on the dataset and report its performance.
5. Model comparison and selection: Analyze the performance of all models based on the metrics obtained and choose the model that shows the best balance between accuracy, generalizability, and computational efficiency.
Thank you

Ayurveda GPT

Problem Description:
The problem is to develop an Ayurveda GPT (Generative Pre-trained Transformer) application trained on concepts provided by ancient Ayurvedic sages such as Sushruta, Charak, and others.

Solution Description:

  1. Gather Data: Collect a diverse dataset of texts, scriptures, and teachings from ancient Ayurvedic texts, including those authored by Sushruta, Charak, and other sages.
  2. Preprocess Data: Clean and preprocess the collected data to remove noise, standardize text formats, and prepare it for training.
  3. Train GPT Model: Utilize the preprocessed data to train a GPT model, ensuring that it captures the nuances and intricacies of Ayurvedic principles, treatments, and philosophies.
  4. Fine-tune Model: Fine-tune the trained GPT model on specific tasks or domains within Ayurveda, such as diagnosis, treatment recommendations, herbal remedies, etc.
  5. Develop Application: Build an intuitive and user-friendly application interface that allows users to interact with the trained GPT model.
  6. Deploy Application: Deploy the Ayurveda GPT application on a suitable platform, making it accessible to users.

Alternatives Considered:

  1. Utilizing existing Ayurveda datasets: Instead of collecting data from scratch, leverage existing datasets of Ayurvedic texts and teachings.
  2. Transfer learning: Instead of training a GPT model from scratch, use transfer learning techniques to adapt pre-trained language models to the Ayurveda domain.

Additional Context:
The completion of this feature will be determined by the successful development and deployment of the Ayurveda GPT application, as well as its usability and effectiveness in providing accurate and valuable insights into Ayurvedic principles and practices.

Chest X-rays: Pneumonia Detection | 1. Dataset Preparation

Step 1:

  1. Add a relevant folder with a README to the repo.
  2. Explore candidate datasets; we can start with https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia
  3. Explore the classes and their respective sizes in the dataset and perform the necessary augmentation.
  4. Decide on an approximate number of samples for each class and a suitable strategy for noise removal and augmentation (be careful about this).
  5. Upload the augmented dataset to Kaggle (easy to work with later).

All these steps prepare us to start working on a proper dataset.
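
A rough sketch of steps 3 and 4 in Python: count the images per class and work out how many augmented samples each class would need to reach a target size. The class names and counts below are placeholders, not figures taken from the Kaggle dataset, and `augmentation_plan` is a hypothetical helper.

```python
from collections import Counter


def augmentation_plan(labels, target=None):
    """Return how many extra (augmented) samples each class needs.

    labels: iterable of class names, one per image.
    target: desired samples per class; defaults to the largest class size.
    """
    counts = Counter(labels)
    if target is None:
        target = max(counts.values())
    return {cls: max(0, target - n) for cls, n in counts.items()}


# Placeholder label list; real counts would come from the dataset folders.
labels = ["NORMAL"] * 3 + ["PNEUMONIA"] * 8
plan = augmentation_plan(labels)
print(plan)  # {'NORMAL': 5, 'PNEUMONIA': 0}
```

Augmenting only the training split, after the split is made, avoids leaking near-duplicates of one image into the evaluation data.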

Cirrhosis Prediction | 2. EDA

Step 2:

  1. Perform univariate and multivariate analysis and draw conclusions from the results.
  2. Explore the correlation matrix (try different methods and check whether they give the same conclusion, and why).
  3. Check the distribution (skewness) of the columns.
  4. Detect outliers (don't remove them).
  5. Detect class label imbalance.

Provide as many relevant graphs and conclusive markdown cells as possible.
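
To illustrate why different correlation methods can disagree, here is a small standard-library sketch comparing Pearson (linear) and Spearman (rank-based) correlation. The data is made up, and the function names are illustrative rather than part of the project.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # monotonic but non-linear (y = x**3)
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

Spearman reports a perfect monotonic relationship here while Pearson stays below 1, which is the kind of difference the correlation-matrix step asks contributors to explain.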

@aditi1807

Create an Issue template

I wish to add an issue template for each issue type, such as bug report, feature addition, and documentation update.

Maternal Health Risk Prediction | 2. EDA

Step 2:

  1. Perform univariate and multivariate analysis and draw conclusions from the results.
  2. Explore the correlation matrix (try different methods and check whether they give the same conclusion, and why).
  3. Check the distribution (skewness) of the columns.
  4. Detect outliers (don't remove them).
  5. Detect class label imbalance.

Provide as many relevant graphs and conclusive markdown cells as possible.
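
As an illustration of the skewness and outlier steps, the sketch below computes population skewness and flags values outside the usual 1.5 × IQR fences without removing them. The readings are invented, and the 1.5 multiplier is a common convention, not something the issue specifies.

```python
from statistics import mean, pstdev, quantiles


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Invented blood-pressure-like readings with one extreme value.
readings = [110, 115, 118, 120, 122, 125, 128, 130, 190]
print(skewness(readings) > 0)  # True: the long upper tail skews it right
print(iqr_outliers(readings))  # [190], flagged for inspection, not removed
```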

PCOS Detection | 1. Dataset Exploration

@Piyushseth55 Would you like to take it up?

Step 1

  1. Load the dataset
  2. Explore and confirm features and label(s) of this dataset
  3. Explore size/shape of dataset
  4. Investigate the data types of the features and labels, and choose a better data type for a column where possible
  5. Calculate the resulting difference in memory usage
  6. Explore statistics such as the mean, median, and percentiles of the columns
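
Steps 4 to 6 can be illustrated with the standard library alone. The sketch below stores the same made-up column as 64-bit and 8-bit integers to show the memory difference, then computes summary statistics. In a notebook one would more likely use pandas (`DataFrame.astype`, `DataFrame.memory_usage`, `DataFrame.describe`); the `ages` column is purely an assumption for illustration.

```python
from array import array
from statistics import mean, median, quantiles

# Made-up "age" column; every value fits in a single byte.
ages = [23, 31, 27, 45, 38, 29, 52, 41]

as_int64 = array("q", ages)  # signed 64-bit integers
as_uint8 = array("B", ages)  # unsigned 8-bit integers

bytes_int64 = as_int64.itemsize * len(as_int64)
bytes_uint8 = as_uint8.itemsize * len(as_uint8)
print(bytes_int64, bytes_uint8)  # 64 8

# Summary statistics: mean, median, and quartile cut points.
q1, q2, q3 = quantiles(ages, n=4)
print(mean(ages), median(ages), (q1, q2, q3))
```

A narrower type is only safe when the full value range of the column fits in it, so checking the minimum and maximum first is part of the same step.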

Breast Cancer | 2. EDA

Is your feature request related to a problem? Please describe.

Breast Cancer EDA

Describe the solution you'd like

EDA

Describe alternatives you've considered

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Cardiovascular Heart Disease Prediction | 2. EDA

Step 2:

  1. Perform univariate and multivariate analysis and draw conclusions from the results.
  2. Explore the correlation matrix (try different methods and check whether they give the same conclusion, and why).
  3. Check the distribution (skewness) of the columns.
  4. Detect outliers (don't remove them).
  5. Detect class label imbalance.

Provide as many relevant graphs and conclusive markdown cells as possible.
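
For the class label imbalance step, a minimal sketch: compute each class's share, the majority-to-minority ratio, and the accuracy a model would get by always predicting the majority class. The 90/10 labels are invented for illustration.

```python
from collections import Counter


def class_balance(labels):
    """Return per-class shares, majority/minority ratio, and baseline accuracy."""
    counts = Counter(labels)
    total = sum(counts.values())
    shares = {cls: n / total for cls, n in counts.items()}
    ratio = max(counts.values()) / min(counts.values())
    baseline = max(shares.values())  # accuracy of always predicting the majority
    return shares, ratio, baseline


# Invented labels: 90 negatives and 10 positives.
labels = [0] * 90 + [1] * 10
shares, ratio, baseline = class_balance(labels)
print(shares, ratio, baseline)
```

A high baseline accuracy is the reason plain accuracy can mislead on imbalanced health data, where the rare class is usually the one of interest.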

@deedGhost Please proceed with this as Step 1 is done.

Cirrhosis Prediction | 1. Explore Dataset

Step 1

  1. Load the dataset
  2. Explore and confirm features and label(s) of this dataset
  3. Explore size/shape of dataset
  4. Investigate the data types of the features and labels, and choose a better data type for a column where possible
  5. Calculate the resulting difference in memory usage
  6. Explore statistics such as the mean, median, and percentiles of the columns

@aditi1807 Please proceed if you would like to take it up.

Body Fat Prediction | 2. EDA

Step 2:

  1. Detect outliers (don't remove them).
  2. Detect class label imbalance.

Provide as many relevant graphs and conclusive markdown cells as possible.
Proceed here after #1.

Brain Tumor MRI Classification

As a GSSoC'24 contributor, I am looking forward to working on "Brain Tumor MRI Classification", focusing on MRI images. The dataset will undergo preprocessing steps such as cleaning (if required; for example, if the image data contains salt-and-pepper noise, it can be removed using a median filter) and feature extraction. For modeling, I will use convolutional neural networks (CNNs), which are well suited to image classification tasks.
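
To make the median-filter idea concrete, here is a small pure-Python sketch on a toy greyscale patch. It is only an illustration; a real pipeline would more likely call an image library routine such as OpenCV's `cv2.medianBlur`, and the 3 × 3 window and the border handling are assumptions.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2-D list of pixel intensities.

    Border pixels use only the neighbours that fall inside the image.
    """
    h, w = len(image), len(image[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                image[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            out[y][x] = median(window)
    return out


# Flat grey patch with one "salt" (255) and one "pepper" (0) pixel.
noisy = [
    [100, 100, 100, 100],
    [100, 255, 100, 100],
    [100, 100, 0, 100],
    [100, 100, 100, 100],
]
clean = median_filter(noisy)
print(clean[1][1], clean[2][2])  # both isolated noise pixels become 100
```

The median suits this noise type because an isolated extreme pixel never becomes the middle value of its neighbourhood, whereas a mean filter would smear it into the surrounding pixels.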

Thanks!

Infant Health Prediction | 1. Explore Dataset

Step 1

  1. Load the dataset
  2. Explore and confirm features and label(s) of this dataset
  3. Explore size/shape of dataset
  4. Investigate the data types of the features and labels, and choose a better data type for a column where possible
  5. Calculate the resulting difference in memory usage
  6. Explore statistics such as the mean, median, and percentiles of the columns

@Basma2423 Please proceed.

Missing Code of Conduct File in Repository

Currently, the repository lacks a Code of Conduct file, which is an essential component for fostering a healthy and inclusive open-source community. A Code of Conduct serves as a guideline for expected behaviour, ensuring that contributors and participants feel safe, respected, and valued within our community space.

Please assign this issue to me.

Lung Cancer Detection | 1. Dataset Exploration

Hello @SrijanShovit
I am one of the contributors to GSSoC'24. I would like to contribute to the Lung Cancer Detection project by doing EDA and then building ML models for the prediction. I will use matplotlib and plotly for the analysis, and for modelling I will use algorithms such as LR, RF, and XGBClassifier.

Hoping to get a positive response.

Thank you

Breast Cancer | 3. Statistical Feature Importance

Is your feature request related to a problem? Please describe.

Feature importance

Describe the solution you'd like

Statistical Methods

Describe alternatives you've considered

@vedanshipathak Proceed here.

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Pneumonia Detection from Chest X-rays

Description

Pneumonia is a common and potentially life-threatening condition, particularly among vulnerable populations such as children, the elderly, and individuals with weakened immune systems. Chest X-rays are a common imaging modality used to diagnose pneumonia, but interpretation can be subjective and time-consuming for radiologists. By developing a machine learning model to automatically detect pneumonia from chest X-ray images, this project aims to assist radiologists in triaging cases, speeding up diagnosis, and improving patient outcomes.

Adding a pneumonia detection project would further expand the scope of this repository and provide a valuable resource for healthcare professionals and researchers working in the field of medical imaging and diagnosis.

@SrijanShovit I would be really thankful if you could assign me this issue as part of GSSoC'24.

Tuberculosis Classification DL | 1. Dataset Prep

Is your feature request related to a problem? Please describe.

Tuberculosis is a communicable illness that causes ill health and the death of millions, so I would like to create a classification model using deep learning techniques.

Describe the solution you'd like

X-ray examination is considered the most commonly used method because of its low cost, wide range of applications, and speed, so detecting features from X-rays is our top priority.

Describe alternatives you've considered

I propose a deep learning model: a CNN with basic layers such as Conv2D, MaxPooling2D, Flatten, and Dense. We could also use YOLO, or whichever works best. For this tuberculosis classification we will use a chest X-ray image dataset and aim to achieve high accuracy.

Additional context

Steps to be followed:-

1. Dataset preprocessing and preparation (gathering, cleaning, understanding, choosing the best dataset, etc.)
2. Feature engineering and image preprocessing
3. Augmentation (flipping, cropping, resizing, rotating, etc.)
4. Train and test split
5. Train the best model using the training set
6. Validate the model using the test set
7. Report accuracy, mAP, and the confusion matrix
8. Inference and testing in an actual environment
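
The train and test split step can be sketched as a stratified split, so that each class keeps roughly the same share in both sets. This is a standard-library illustration under assumed file names, class labels, and a 20% test fraction; none of these come from the issue.

```python
import random
from collections import defaultdict


def stratified_split(samples, labels, test_fraction=0.2, seed=42):
    """Split samples so each class keeps roughly the same share in both sets."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label, items in by_class.items():
        rng.shuffle(items)
        n_test = max(1, round(len(items) * test_fraction))
        test += [(s, label) for s in items[:n_test]]
        train += [(s, label) for s in items[n_test:]]
    return train, test


# Hypothetical file names standing in for chest X-ray images.
samples = [f"img_{i}.png" for i in range(50)]
labels = ["tb"] * 10 + ["normal"] * 40
train, test = stratified_split(samples, labels)
print(len(train), len(test))  # 40 10
```

Splitting before augmentation keeps augmented copies of a training image out of the test set.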

Code of Conduct

  • I agree to follow this project's Code of Conduct

Stroke Prediction | 1. Explore Dataset

Step 1

  1. Load the dataset
  2. Explore and confirm features and label(s) of this dataset
  3. Explore size/shape of dataset
  4. Investigate the data types of the features and labels, and choose a better data type for a column where possible
  5. Calculate the resulting difference in memory usage
  6. Explore statistics such as the mean, median, and percentiles of the columns

@Divyanshi1002 Please proceed.

Infant Health | EDA

Is your feature request related to a problem? Please describe.
Perform EDA on the Infant Health Dataset
The infant health data is ready and I would like to perform EDA on it.

Details
Perform univariate and multivariate analysis, outlier detection (and removal, if you allow it), and all other standard EDA processes.
Feature selection.
Feature extraction: I will try to create new features for the best accuracy.

I also want to train ML and DL models using linear regression, decision tree, XGBoost, and a neural network.

Compare all the models and determine which performs best on this data.

I am a GSSoC contributor.

Body Fat Prediction | 3. Feature Importance | 3.3 ML Based Feature Importance Tests

Is your feature request related to a problem? Please describe.

Related to highlighting feature importance in this dataset.

Describe the solution you'd like

Explore techniques like RFE, DT and others (at least 5).

Describe alternatives you've considered

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Stroke Prediction | 2. EDA

Step 2:

  1. Perform univariate and multivariate analysis and draw conclusions from the results.
  2. Explore the correlation matrix (try different methods and check whether they give the same conclusion, and why).
  3. Check the distribution (skewness) of the columns.
  4. Detect outliers (don't remove them).
  5. Detect class label imbalance.

Provide as many relevant graphs and conclusive markdown cells as possible.

Once #16 is finished, proceed with this, @Divyanshi1002.

explorative data analysis

I want to add some more modifications to the existing notebooks regarding exploratory data analysis, such as data normalization, reducing skewness, and trying more models like random forests, naive Bayes, etc.
Please assign this issue to me under GSSoC 2024.
I can contribute to this project.
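
As a rough illustration of the normalization and skewness-reduction ideas mentioned here, the sketch below applies min-max scaling and a `log1p` transform to an invented right-skewed column. These are two common choices, not necessarily the ones the notebooks would use.

```python
from math import log1p
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def log_transform(values):
    """log(1 + v) compresses a long right tail; needs non-negative values."""
    return [log1p(v) for v in values]


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Invented right-skewed column (each value doubles the previous one).
raw = [1, 2, 4, 8, 16, 32, 64]
before = skewness(raw)
after = skewness(log_transform(raw))
print(round(before, 2), round(after, 2))
print(min_max(raw)[0], min_max(raw)[-1])  # 0.0 1.0
```

Min-max scaling changes the range but not the shape of the distribution, so it does not reduce skewness by itself; the log transform is what pulls in the tail.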

Metabolic Syndrome Prediction | 2. EDA

Step 2:

  1. Perform univariate and multivariate analysis and draw conclusions from the results.
  2. Explore the correlation matrix (try different methods and check whether they give the same conclusion, and why).
  3. Check the distribution (skewness) of the columns.
  4. Detect outliers (don't remove them).
  5. Detect class label imbalance.

Provide as many relevant graphs and conclusive markdown cells as possible.
After #5, proceed here, @Arihant-Bhandari.

Period Tracker to detect PCOS

Hi @SrijanShovit
The purpose of this issue is to build and enhance a period tracker application to better support individuals with PCOS or to detect PCOS.
Could you assign me this issue under GSSOC'24?

Contributors Highlight

Is your feature request related to a problem? Please describe.

To highlight the handles of contributors

Describe the solution you'd like

A script that picks the contributors from the repo, fetches their LinkedIn, Twitter, and GitHub profiles, and displays them in tabular form.
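
A possible starting point for such a script, sketched with the standard library. It assumes the GitHub REST endpoint `GET /repos/{owner}/{repo}/contributors`, whose items include `login`, `html_url`, and `contributions`; LinkedIn and Twitter links are not part of that response and would need another source. The function names and the sample data are hypothetical, and the `owner` and `repo` values are left to the caller.

```python
import json
from urllib.request import Request, urlopen


def fetch_contributors(owner, repo):
    """Fetch the first page of contributors from the GitHub REST API."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contributors"
    req = Request(url, headers={"Accept": "application/vnd.github+json"})
    with urlopen(req) as resp:
        return json.load(resp)


def contributors_table(contributors):
    """Render a Markdown table, most active contributors first."""
    rows = ["| Contributor | GitHub | Contributions |", "| --- | --- | --- |"]
    for c in sorted(contributors, key=lambda c: -c["contributions"]):
        rows.append(f"| {c['login']} | {c['html_url']} | {c['contributions']} |")
    return "\n".join(rows)


# Hypothetical sample in the shape returned by the endpoint (no network call).
sample = [
    {"login": "example-a", "html_url": "https://github.com/example-a", "contributions": 3},
    {"login": "example-b", "html_url": "https://github.com/example-b", "contributions": 7},
]
table = contributors_table(sample)
print(table)
```

Keeping the fetch and the rendering separate means the table layout can be tested offline, and the output can be pasted into a README section.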

Describe alternatives you've considered

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

CTG Data: Fetal Health Classification | 1. Data Preparation and Pipeline Creation

Hi @SrijanShovit, I would like to work on this dataset. These are the proposed steps:

1. Exploring the dataset, EDA, and class identification
2. Visualizing the dataset and the corresponding classes to gain insights
3. Imputing outliers and NULL values
4. Normalizing skewed values and one-hot encoding text attributes
5. Comparing KNN, Random Forest, and SVC
6. Validating the evaluated models on the validation set using metrics like accuracy, precision, recall, and F1 score
7. Including visualizations (e.g., training curves, confusion matrices) to better illustrate model performance
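
The evaluation metrics named in the steps can be computed directly from the confusion-matrix counts. The sketch below does this per class (one-vs-rest) with the standard library; the class names and predictions are invented, and a notebook would more likely rely on a library report.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return accuracy and per-class precision, recall, and F1 (one-vs-rest)."""
    cells = Counter(zip(y_true, y_pred))  # confusion-matrix cells
    classes = sorted(set(y_true) | set(y_pred))
    accuracy = sum(cells[(c, c)] for c in classes) / len(y_true)
    report = {}
    for c in classes:
        tp = cells[(c, c)]
        fp = sum(cells[(t, c)] for t in classes if t != c)
        fn = sum(cells[(c, p)] for p in classes if p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


# Invented labels and predictions for illustration.
y_true = ["normal", "normal", "normal", "suspect", "suspect", "pathological"]
y_pred = ["normal", "normal", "suspect", "suspect", "normal", "pathological"]
accuracy, report = per_class_metrics(y_true, y_pred)
print(round(accuracy, 3))
for cls, scores in report.items():
    print(cls, scores)
```

Per-class figures matter when classes are imbalanced, because a model can score high accuracy while missing most of the rarest class.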

Further:

Creation of an automated pipeline and a custom transformer for the dataset, covering each proposed step.

Feat: Automate greeting using GitHub bot 🤖

Describe the feature

As the contributor count rises on the repo, it becomes increasingly challenging for maintainers to personally greet and encourage each contributor for their valuable input. Equally important is the reminder for them to review the project's contribution guidelines.

Add ScreenShots

PR greeting message ⬇️

[Screenshot: PR greeting message]

ISSUE greeting message ⬇️

[Screenshot: issue greeting message]

Record

  • I agree to follow this project's Code of Conduct
  • I'm a GSSoC'24 contributor
  • I want to work on this issue

Diabetes Classification | 1. Dataset Exploration

Hey @SrijanShovit
I am Anurag, a contributor in GSSoC'24. I am exploring machine learning and would like to work on the diabetes classification dataset.
My approach:
I would begin by creating in-depth plots of the features of a diagnosis to determine which parameters are relevant to the diagnosis. Based on my findings, I would use the chosen parameters to build a classification model using either logistic regression or a random forest classifier, making changes if necessary along the way.
I hope you find my approach useful and assign me this project under GSSoC'24.

Thank You,
Anurag

Hepatitis C Prediction | 1. Explore Dataset

Step 1

  1. Load the dataset
  2. Explore and confirm features and label(s) of this dataset
  3. Explore size/shape of dataset
  4. Investigate the data types of the features and labels, and choose a better data type for a column where possible
  5. Calculate the resulting difference in memory usage
  6. Explore statistics such as the mean, median, and percentiles of the columns

@SanjanaBankar Please proceed.

Maternal Health Risk Prediction | 1. Explore Dataset

Step 1

  1. Load the dataset
  2. Explore and confirm features and label(s) of this dataset
  3. Explore size/shape of dataset
  4. Investigate the data types of the features and labels, and choose a better data type for a column where possible
  5. Calculate the resulting difference in memory usage
  6. Explore statistics such as the mean, median, and percentiles of the columns

Infant Health Prediction | 2. EDA

Step 2:

  1. Perform univariate and multivariate analysis and draw conclusions from the results.
  2. Explore the correlation matrix (try different methods and check whether they give the same conclusion, and why).
  3. Check the distribution (skewness) of the columns.
  4. Detect outliers (don't remove them).
  5. Detect class label imbalance.

Provide as many relevant graphs and conclusive markdown cells as possible.

PCOS Detection | 2. EDA

Is your feature request related to a problem? Please describe.

PCOS Detection EDA to investigate data and generate insights.

Describe the solution you'd like

Step 2:

  1. Perform univariate and multivariate analysis and draw conclusions from the results.
  2. Explore the correlation matrix (try different methods and check whether they give the same conclusion, and why).
  3. Check the distribution (skewness) of the columns.
  4. Detect outliers (don't remove them).
  5. Detect class label imbalance.

Provide as many relevant graphs and conclusive markdown cells as possible.

Describe alternatives you've considered

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Stroke Prediction

Hello @SrijanShovit,
I would like to contribute to Stroke Prediction under GSSoC'24 by implementing the following methods:

Logistic Regression: An appropriate choice for predicting stroke likelihood as a binary outcome (0 if no stroke, 1 if stroke).

Random Forest: Combines multiple decision trees to improve predictive performance, which suits the variety of attributes in the dataset and the potential non-linear relationships between predictors and stroke likelihood.
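
To show what the logistic regression option does with a 0/1 outcome, here is a minimal one-feature sketch trained by batch gradient descent. The feature values, learning rate, and epoch count are illustrative assumptions; an actual notebook would more likely use a library estimator such as scikit-learn's `LogisticRegression`.

```python
from math import exp


def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(x, w, b, threshold=0.5):
    """Return 1 (stroke) when the predicted probability reaches the threshold."""
    return int(sigmoid(w * x + b) >= threshold)


# Invented single risk feature; label 1 means stroke, 0 means no stroke.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(predict(0.0, w, b), predict(5.0, w, b))  # 0 1
```

The model outputs a probability rather than a hard label, so the 0.5 threshold can be lowered when missing a stroke case is costlier than a false alarm.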

Please assign me this issue under the GSSoC'24 label.
Looking forward to contributing to this project!

Best Regards,
Divyanshi
GSSOC'24 contributor
