rakesh9100 / ml-project-drug-review-dataset

This is an innovative machine learning project that utilizes patient reviews with many other attributes to analyze and evaluate the effectiveness of drugs.

Home Page: https://ml-project-drug-review-dataset.streamlit.app

License: Apache License 2.0

Python 100.00%

machine-learning drug-database model project python hacktoberfest gssoc2023 open-source open-source-development open-source-project

ml-project-drug-review-dataset's Issues

Add Random Forest Classification model and other Classifier Boosting models

@Rakesh9100 The aim is to find out the best Classification model. Please assign me this issue.

Implement EDA, Data Preprocessing and Oversampling

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I want to apply different EDA techniques, including statistical and mathematical approaches, use traditional models (like SVM, KNN, etc.) and deep learning models (like CNN, RNN, etc.) to check performance, and apply data augmentation with k-fold validation to improve accuracy.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add Requirements.txt file

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

To maintain a Python project, we need a requirements file to avoid dependency conflicts in the future.
I will create a virtual environment for this project and generate a requirements.txt file, which can be installed with:
$ pip install -r requirements.txt

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add a Website Using Flask, HTML, and CSS

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

As described in the README file, this project does not yet have a GUI. I would like to create a website for this using Flask, HTML, and CSS.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add other models and algos for better analysis of the dataset

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Please assign me this issue. I am very excited to work on it.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add a plot that compares various ML model used to predict their accuracy

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Hello,
I am a GSSoC'23 contributor
I would like to create a plot that compares the accuracy of the various machine learning models used in the above code. This would let contributors see the accuracy of each ML and DL algorithm at a glance, making it easier to deploy and test new ML algorithms, or to improve the statistics of the dataset and see the effect on accuracy.
Please assign it to me.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add seaborn.categorical and seaborn.matrix for better visualization

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Implementation approach to resolve this issue.

seaborn.categorical: This submodule provides functions for visualizing categorical data, which improves the project's visualizations.
If your drug review dataset contains categorical variables such as drug types, ratings, or side effects, you can use functions like barplot, countplot, or violinplot.

seaborn.matrix: This submodule provides functions for visualizing matrices and rectangular datasets.
If your drug review dataset contains tabular data with multiple variables, you can use functions like heatmap to create a color-encoded grid that visualizes patterns or correlations between the variables.

Screenshots

For example (the same will be implemented for different drugs):

[Screenshots: seaborn.categorical example; seaborn.matrix example]

Code of Conduct

I follow Contributing Guidelines of this project.

Add Basic Streamlit UI for the project

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Hello sir. Arun here. I have worked as an ML research intern for AIIMS on behalf of my college and have worked in health research domains, which is why I am raising this issue. This issue aims to address these concerns and proposes the following improvements to make the model more reliable and efficient.

Handle Random Sampling:

Randomly selecting 80% of the training data may introduce bias and affect model performance. Instead, consider implementing sampling techniques to maintain the class distribution while selecting a subset of given data.
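
One way to keep the class distribution while subsampling is to sample each class separately (stratified sampling). The sketch below uses only the standard library; the function name `stratified_sample`, the toy rows, and the per-class 80% fraction are illustrative assumptions, not the project's code.

```python
import random
from collections import defaultdict


def stratified_sample(rows, label_of, fraction=0.8, seed=0):
    """Return about `fraction` of the rows from each class."""
    rng = random.Random(seed)

    # Group rows by their class label
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    # Sample the same fraction from every class
    sample = []
    for group in by_class.values():
        k = max(1, round(len(group) * fraction))  # keep at least one row per class
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical toy data: (review_id, rating_class), 80 positive and 20 negative
rows = [(i, "positive") for i in range(80)] + [(i, "negative") for i in range(80, 100)]
subset = stratified_sample(rows, label_of=lambda r: r[1])

positives = sum(1 for r in subset if r[1] == "positive")
negatives = sum(1 for r in subset if r[1] == "negative")
print(positives, negatives)
```

The 4:1 class ratio of the toy data is preserved in the subset. If the project already splits data with scikit-learn, the `stratify` argument of `train_test_split` serves a similar purpose.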

Optimize Memory Usage:

Use pandas' astype function to downcast numeric columns and consider using sparse data structures where applicable to further reduce memory consumption.

HTML Decoding Function:

Evaluate the performance impact of the decode_html function that decodes HTML-encoded characters. Optimize the function, if necessary, to improve efficiency and minimize processing time.
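
As a rough baseline for such an evaluation, the standard library's `html.unescape` can be timed with `timeit`. This is a sketch only: the project's own decode_html implementation is not shown in this issue, and the sample review string is made up.

```python
import html
import timeit

# Hypothetical review text containing HTML entities
review = "&quot;It didn&#039;t work for me &amp; caused nausea&quot;"

# html.unescape converts named and numeric character references in one pass
decoded = html.unescape(review)
print(decoded)

# Time 10,000 calls to get a baseline to compare a custom function against
elapsed = timeit.timeit(lambda: html.unescape(review), number=10_000)
print(f"10,000 calls took {elapsed:.4f} s")
```

Timing the existing decode_html function the same way would show whether any optimization is actually needed.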

Missing Value Handling:

Assess the impact of using SimpleImputer for handling missing values and explore alternative strategies such as data imputation based on domain knowledge or techniques like K-nearest neighbors (KNN) or IterativeImputer.

Feature Encoding:

Check the effectiveness of alternative feature encoding methods such as one-hot encoding, target encoding, or entity embeddings to capture complex relationships and improve model performance.

Regularisation:

Consider L1 (Lasso) or L2 (Ridge) regularization to handle potential overfitting and improve generalization. In scikit-learn, LinearRegression itself has no regularization option, so this means swapping it for the Lasso or Ridge estimators and tuning the penalty settings of LogisticRegression.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Develop an Interactive Drug Recommendation System

Add Word Embedding and PCA for feature engineering

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I am a GSSOC'23 contributor.
Instead of simple TF-IDF vectorization with a bag-of-words model, I wish to use word embedding models like Word2Vec, GloVe, and FastText for context-based vectorization of sentences rather than statistical vectorization. I also wish to use principal component analysis to reduce the dimension of the feature space from 3000, creating a more condensed and easier-to-learn dataset.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Fix minor typo in README

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Hi, I am a GSSOC'23 contributor. I noticed a minor typo in the README file.

https://github.com/Rakesh9100/ML-Project-Drug-Review-Dataset/blame/main/README.md#L256

The last word on line 256 of README.md gives the wrong repo name:

- Music-Recommendation-System.
+ ML-Project-Drug-Review-Dataset.

Screenshots

Code of Conduct

I follow Contributing Guidelines of this project.

Improve Random Forest model

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I want to apply oversampling and undersampling with the Random Forest classifier. Please assign me this issue.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Change perceptron model to Multi Layer Perceptron Classifier

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

The perceptron used in this ML model is a single-layer neuron with a much lower efficiency of ~0.14.
I would like to further explore the effectiveness of a neural network model by adding hidden layers between input and output, using scikit-learn's neural network module.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Change Plot Confusion Matrix to ConfusionMatrixDisplay

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

The function plot_confusion_matrix from sklearn.metrics has been deprecated since scikit-learn version 1.0 and was removed in version 1.2.
Either pin an older version of sklearn in the requirements.txt file or use the ConfusionMatrixDisplay class from sklearn.metrics.
Documentation link : https://scikit-learn.org/stable/modules/generated/sklearn.metrics.ConfusionMatrixDisplay.html#sklearn-metrics-confusionmatrixdisplay
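
A minimal sketch of the replacement, assuming scikit-learn 1.0 or later and matplotlib are installed. The labels, predictions, and display names are made up for illustration and are not taken from the project.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display

from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Hypothetical true labels and model predictions for three classes
y_true = [0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 2, 2, 2, 1, 0, 1]

# Compute the matrix first, then hand it to the display object
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=["low", "medium", "high"])
disp.plot()  # draws onto a matplotlib figure, available as disp.figure_
```

ConfusionMatrixDisplay also provides the class methods `from_estimator` and `from_predictions`, which compute the matrix and plot it in one call.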

Screenshots

Code of Conduct

I follow Contributing Guidelines of this project.

Implement EDA, preprocessing, feature engineering and model building on the dataset

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I would like to perform EDA on the given dataset, along with preprocessing, feature engineering, and model building, and raise the accuracy above 50%, which is the present accuracy. I will achieve this through good data-cleaning processes. So, I would like @Rakesh9100, sir, to assign me this issue under GSSoC 2023, so that I can start working.
Thank you

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Implement deep learning algorithm to improve accuracy

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Deep learning techniques often outperform classical ML techniques, so trying one on this dataset may give better results. I plan to try a Convolutional Neural Network (CNN).

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add XGBoost, LGBM, SVR algorithms

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I would like to implement the XGBoost, LGBM, and SVM algorithms for the above dataset.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Change tsv files to csv

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

In the datasets, we have TSV files. I would like to convert them into CSV and add them to the dataset. This would give us the option of working with either TSV or CSV.
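
The conversion can be sketched with the standard `csv` module. The function name `tsv_to_csv` and the sample columns are assumptions for illustration; the demonstration runs on in-memory text rather than the project's files.

```python
import csv
import io


def tsv_to_csv(tsv_file, csv_file):
    """Copy rows from a tab-separated stream to a comma-separated stream."""
    reader = csv.reader(tsv_file, delimiter="\t")
    writer = csv.writer(csv_file)  # quotes fields that contain commas
    count = 0
    for row in reader:
        writer.writerow(row)
        count += 1
    return count


# Demonstration on in-memory text (hypothetical columns and values)
tsv_text = "drugName\treview\trating\nExampleDrug\tWorked well, no side effects\t9\n"
out = io.StringIO()
rows_written = tsv_to_csv(io.StringIO(tsv_text), out)
print(out.getvalue())
```

With real files, both should be opened with `newline=""`, as the `csv` module documentation recommends. Reviews containing commas are quoted automatically by the writer, so they stay in one column.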

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Extraction of meaningful data from patient reviews using NLP

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Greetings!!
I am a contributor to GSSOC'23.
I am very interested in NLP-related work and would like to work on this.
This issue would focus on using language processing techniques to extract meaningful insights from patient reviews. It also aims to remove expressions from the patient reviews that are difficult for an algorithm to interpret and understand.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Implement Sentiment Analysis, EDA and Gradient Boost Algorithms

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I want to perform sentiment analysis and EDA, including preprocessing, feature engineering, and N-grams. I think classification algorithms like XGBoost and CatBoost will increase the accuracy.
I have good knowledge of machine learning algorithms.

Kindly assign me with this issue under GSSOC '23 :)

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add LSTM Model

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Hey Everyone!

I am Archisman | Contributor GSSOC'23

Based on the dataset in the repo, I would recommend using a long short-term memory (LSTM) network.

Here are some specific suggestions for using an LSTM network for the project:

Extract features from the text of the review: The LSTM network can be used to extract features from the text of the review. These features can then be used to train a machine-learning model that can predict the sentiment of the review.
Identify the sentiment of the review: The LSTM network can be used to identify the sentiment of the review. This can be done by training the LSTM network on a dataset of reviews that have been labeled with the sentiment.
Recommend drugs to users based on the sentiment of the review: The LSTM network can be used to recommend drugs to users based on the sentiment of the review. This can be done by training the LSTM network on a dataset of reviews that have been labeled with both sentiment and the drug that was recommended to the user.

By following these suggestions, the LSTM network can be used to improve the accuracy and efficiency of the project.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Implement web scraping

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Web scraping of Flipkart listings for mobile phones under 50000, as they are among the most necessary devices nowadays.
1. PRODUCT NAME
2. PRICES
3. DESCRIPTION
4. REVIEW

I will scrape a few pages and provide the source code and a CSV file as well.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add black linter workflow

Add a suitable linter to ensure proper linting of the code. As a project with multiple users contributing to it, it is essential to maintain a proper linting style to ensure better code readability by users. An excellent way to do this is by automatic linting checks using GitHub actions.

I would like to work on this project as a part of GSSoC 23. Please assign this issue to me.

Add scatter visualization of drug name and ratings

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

@Rakesh9100 Dear Sir,

I am a GSSOC 2023 open source contributor and data science enthusiast. I will be creating multiple plots for data visualization and accuracy metrics visualization of the dataset.

Please assign me this task. It will be a great opportunity for me to work on it.

Thank you

Dhruv Saxena

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add GridSearchCV for identifying best model

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

As a GSSoC contributor, I would like to contribute to this repo by adding GridSearchCV to the code base, which would help us identify the best model among the available models.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add RandomForest Model with RandomizedSearchCV

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I would like to enhance the project by using the RandomForest model together with RandomizedSearchCV. There are other similar issues, but I haven't come across one where RandomForest was mentioned.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add contributors section on README

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Display all the contributors on readme

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Usage of sentimental Analysis on Drug Review

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Use NLP techniques to process the drug review dataset and calculate a sentiment rating for new entries.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Fix error with the import of plot_confusion_matrix in collab notebook

Hello,
I would like to replace plot_confusion_matrix with ConfusionMatrixDisplay, as it is supported by sklearn on every interpreter.

Add XGBClassifier Model and WordCloud for the Review Column

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I'm a GSSoC'23 contributor. I would like to work on this ML-project to modify its model using Text Classification techniques as well as constructing Hybrid models.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add cross validation techniques

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I would like to propose adding cross-validation techniques to the drug review project. Currently, the project includes algorithms such as linear regression, logistic regression, perceptron, and decision tree. By incorporating cross-validation, we can obtain more robust evaluations of these algorithms and improve the model's performance.

I suggest using scikit-learn's cross_val_score function and KFold for the cross-validation strategy. This example can serve as a starting point for implementing cross-validation with other algorithms as well.
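
A minimal sketch of that setup with `cross_val_score` and `KFold`, assuming scikit-learn is installed. The synthetic data from `make_classification` is a stand-in for the project's feature matrix and labels, and the choice of LogisticRegression with 5 folds is illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the project's features and labels
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

model = LogisticRegression(max_iter=1000)

# Five folds, shuffled once with a fixed seed for reproducibility
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```

Reporting the mean together with the spread across folds gives a steadier estimate than a single train/test split. Any of the other estimators can be passed in place of `model`.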

By incorporating cross-validation, we can better assess the performance of the models, estimate their generalization capabilities, and potentially fine-tune hyperparameters to enhance their accuracy. This addition will make the project more reliable and comprehensive.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Implement Ensemble Model of Sparse Multi-Layer Perceptron

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Hello, I am Siddhant Dutta, a GSSOC'23 contributor. I have done a similar task on this topic and I think I can contribute greatly to this open-source project. This is my resume - Resume

Start: whenever the project is assigned to me
Completion: 1 month after the project is assigned to me

Tasks I will do -

To implement an ensemble model of sparse multi-layer perceptrons (MLPs) to analyze and evaluate the effectiveness of different drugs in treating specific medical conditions.
These models have proved themselves to work and give a high accuracy for text-based datasets where the text corpus is huge.
The main part of the project lies in the feature engineering section, hence I would provide a complete end-to-end pipeline in Python.
Will do a complete EDA in the initial stage and then would apply different kinds of strategies such as the creation of sparse embeddings and best feature selection to improve the final SMLP model.
After feature engineering, I would also try out different models such as LightGBM and CatBoost and test their accuracy as well.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add the KNN Model

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Add alternative models such as SVM, KNN etc. and try results

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Implement Web Scraping Using Multiprocessing

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Hi
I'm Ashis Baidya | Contributor GSSOC'23.
Many libraries and models are available for web scraping drug-related data (in our case, patient reviews and the drugs' effects on them). I would like to use Python libraries that can make the scraping more efficient and reduce the time needed to scrape the data: requests, bs4, html5lib, dask, scrapy and multiprocessing.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add Bert Model

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Hi
I'm Shubham | Contributor GSSOC'23.
Many models are available for performing sentiment analysis (in our case, on patient reviews), and they have been shown to raise prediction accuracy significantly. Therefore, I would like to implement the BERT model, which can be used through the Hugging Face "transformers" library.

Implementation:

Deep learning models such as BERT are known for their Transformer-based architecture; BERT needs to be fine-tuned on our drug dataset.
Hugging Face transformers offers various functionalities that can ease model training while keeping the code readable.
I suggest training the model with the help of the PyTorch library (only for creating a CustomDataset) and DataLoaders.
All of this should be coded in a file separate from "main.py" so that those unfamiliar with the PyTorch library can still go through the code base.

Deliverables:

A separate file from "main.py" implementing and training the BERT model for improving the accuracy.
The deep learning model can then be imported into the file "main.py" with just a few lines of code.

Issue:
Further, for visualisation, I suggest using only seaborn, as sklearn.metrics.plot_confusion_matrix is deprecated.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add some Issue templates

Description:

I noticed that the repository does not have an issue template. Having an issue template can greatly improve the clarity and consistency of issue reports, making it easier for contributors and maintainers to understand and address the problems or feature requests effectively.

Here are three templates that I want to include:

Issue Template: This template enables users to create well-structured issue reports by providing sections for a concise title, detailed description, steps to reproduce (if applicable), expected behavior, actual behavior, additional information, and environment details.
Feature Request Template: The feature request template allows users to outline their desired features with clarity, including sections for a clear feature description, expected benefits, and any additional context or information.
Documentation Template: This template facilitates the creation of comprehensive documentation by providing a structured format, including sections for an introduction, usage instructions, examples, and other relevant details.

Fix some of the errors

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Traceback (most recent call last):
File "c:\Users\ISHRITA\AppData\Local\Temp\Temp2_ML-Project-Drug-Review-Dataset-main.zip\ML-Project-Drug-Review-Dataset-main\main.py", line 10, in <module>
from sklearn.metrics import mean_squared_error, r2_score, accuracy_score, confusion_matrix, plot_confusion_matrix
ImportError: cannot import name 'plot_confusion_matrix' from 'sklearn.metrics' (C:\Users\ISHRITA\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\metrics\__init__.py)

Screenshots

Code of Conduct

I follow Contributing Guidelines of this project.

Fix some errors in main file

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Error

main.py has several errors. First, a basic classification model can't predict float values, yet here y_train and y_test are floats, which is one of the major errors. Second, there is a problem when fitting the data in linear regression. Third, plot_confusion_matrix, imported on line 7, is no longer available in the library. Fourth, the paths to the train and test datasets are assigned in a way that is not valid.

Solution

I've fixed most of them; only one remains.

Screenshots

I can also fix this very easily!

So please assign me this issue I'll make all these errors vanish!

Code of Conduct

I follow Contributing Guidelines of this project.

Add CodeQL workflow

Description

Add the CodeQL semantic analysis workflow to the repo, to run on every push, commit and pull request.

I want to work on this issue as a part of GSSoC'23. Please assign this issue to me.

Enhance accuracy by using multiple regression models at a time

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Hey, I am a fellow GSSOC'23 contributor and would like to use the pycaret module to test and deploy about 25 regression models and select the best model, aiming for above 90% accuracy. AutoML is an advanced machine learning technique, and with appropriate tuning and ensembling we would get much better insights into drug reviews from different patients.

Screenshots

Code of Conduct

I follow Contributing Guidelines of this project.

Implement data analysis and visualization and use more models

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Good Evening,
I am a GSSoC'23 contributor
I would like to perform in-depth preprocessing and EDA on the dataset, and add more visualisations to make the analysis easier to understand and more effective.
Furthermore, I will use more models like XGBoost, SVM, RandomForest in order to improve the accuracy.
Please assign it to me.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Improve Accuracy by Exploring Alternative Machine Learning Models

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

As a participant in the GirlScript Summer of Code (GSSoC) program, I have noticed an opportunity to enhance the accuracy of the machine learning model by exploring alternative models. Although Linear Regression, Logistic Regression, Perceptron, and Decision Tree Classifiers have been applied, I believe that considering additional models can help us achieve higher predictive performance.

To address this, I propose the following steps:

Research and Evaluate Alternative Models: Conduct a thorough investigation of various machine learning models suitable for the given task, such as Random Forest, Support Vector Machines (SVM), Gradient Boosting Models (e.g., XGBoost, LightGBM), Neural Networks (e.g., using TensorFlow or PyTorch), Naive Bayes, K-Nearest Neighbors (KNN), and others. Compare their strengths, weaknesses, and suitability for the dataset.
Implement and Train Alternative Models: Select a subset of promising models and implement them in the project. Utilize appropriate libraries and frameworks for each model and train them on the Drug Review Dataset, following best practices and considering proper hyperparameter tuning techniques.
Performance Evaluation and Comparison: Evaluate the performance of the alternative models using appropriate evaluation metrics (e.g., accuracy, precision, recall, F1-score) and compare them against the existing models. Identify the models that show improved accuracy or other desirable performance characteristics.
Fine-tuning and Model Ensemble: If a specific alternative model outperforms the existing ones, consider further fine-tuning its hyperparameters to maximize its potential. Additionally, explore the possibility of creating a model ensemble by combining the strengths of multiple models to achieve even higher accuracy.
Documentation and Reporting: Document the findings, including the performance comparison, insights gained, and recommendations for selecting the most accurate model(s). Provide clear instructions on integrating the recommended models into the project and any modifications required for the existing implementation.
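
The evaluation and comparison step can be sketched in plain Python. The function name `classification_metrics`, the model names, and the toy labels and predictions are hypothetical; a real comparison would use each model's predictions on the held-out test set.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test labels and predictions from two candidate models
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 1, 1, 0],
    "model_b": [1, 1, 1, 1, 0, 0, 1, 0],
}

results = {name: classification_metrics(y_true, pred) for name, pred in predictions.items()}
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Looking at precision and recall alongside accuracy helps when the rating classes are imbalanced, since accuracy alone can hide poor performance on the minority class.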

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Implement Patient Condition classification using Drug Reviews

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

Hi, I am a GSSOC'23 contributor. I can improve the model by classifying the patient's condition from the patient reviews using NLP.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Modification of word embedding technique from Tf-Idf to Word2Vec

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I want to apply word2vec embedding technique and observe the corresponding change in accuracy. Along with it, I want to apply GridSearchCV to see if it improves accuracy of the already applied algorithm

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Perform EDA and Implement Deep Learning models

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

As we know, deep-learning-based models have surpassed classical machine-learning-based approaches in various text classification tasks, including sentiment analysis, news categorization, question answering, and natural language inference.

Perform complete EDA to get insights
Use Sampling Techniques
We can implement a few deep learning models to increase the efficiency of the project and get better outcomes.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add an interface using Html, CSS and JS for the project

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

As the project has no interface, I can build one and add some features to it.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Add readme badges

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

I want to add README badges. Please assign this issue to me.

Screenshots

Code of Conduct

I follow Contributing Guidelines of this project.

Change the tsv files to csv

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

The datasets are in the tsv format. I wanted to convert it into CSV because it is the default file type we work with most of the time. Also, the main.py is showing some error that the file is not available so I wanted to see if it works this way.

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Improve accuracy using nlp techniques

Prerequisites

I am an Open Source Contributor
I checked to make sure that this issue has not already been filed

Description

1. We can preprocess the textual data (lowercasing, removing punctuation, stemming, lemmatization) and perform EDA.
2. Use NLP techniques like N-grams and word clouds to get some idea about people's feedback.
3. Perform sentiment analysis on patient feedback using traditional machine learning algorithms like support vector machines and Logistic Regression in the training phase, and generate predictions for the test data.
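
The first two steps can be sketched with the standard library alone. The helper names `preprocess` and `ngrams` and the sample reviews are made up for illustration; stemming and lemmatization are left out because they need an external library such as NLTK.

```python
import string
from collections import Counter


def preprocess(text):
    """Lowercase the text, strip punctuation, and split it into tokens."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def ngrams(tokens, n):
    """Return the n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical patient reviews
reviews = [
    "No side effects, works great!",
    "Terrible side effects. Would not recommend.",
]

# Count bigrams across all reviews
bigram_counts = Counter()
for review in reviews:
    bigram_counts.update(ngrams(preprocess(review), 2))

print(bigram_counts.most_common(3))
```

The example also shows a limit of raw bigram counts: ("side", "effects") appears in both a positive and a negative review, so counts like these are better used as features for a classifier than read as sentiment on their own.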

Screenshots

No response

Code of Conduct

I follow Contributing Guidelines of this project.

Implement a flask app for the model

Description

Creating a flask app for your model.

Screenshots

No response

Additional information

No response