data-centric-ai-community / awesome-python-for-data-science

A curated list of awesome resources such as books, tutorials, courses, open-source libraries, exercises, and other materials that support Pythonistas in the making, and Pythonistas migrating into Data Science! 📊

Home Page: http://discord.com/invite/mw7xjJ7b7s

Python 0.01% Dockerfile 0.01% Jupyter Notebook 96.59% HTML 3.41%

data-science exercises learn-to-code learning-by-doing learning-python learning-resources machine-learning python awesome-list data-quality

awesome-python-for-data-science's People

ben-luc-hol jovasquez84 dhinojosa93 sivaarwin gergues adamrossnelson jessehenson gkrintir chonkcheto axikop eniayejudaniel keshavaspanda abeebadeniyi willmartell

awesome-python-for-data-science's Issues

Data Visualization with Matplotlib and Seaborn

Description

Craft a tutorial on data visualization using Matplotlib and Seaborn. Show beginners how to create various types of plots and charts to explore and present data.

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Hyperparameter Tuning with Grid Search

Description

Develop a tutorial showcasing how to optimize machine learning models by tuning hyperparameters using grid search.
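A possible starting point for the notebook: grid search simply scores every combination of candidate hyperparameter values and keeps the best one. The sketch below uses only the standard library; `grid_search`, `toy_score`, and the parameter names are illustrative assumptions, and `toy_score` stands in for a real cross-validated model score.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Score every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, depth):
    """Hypothetical stand-in for a cross-validated score (higher is better)."""
    return -((learning_rate - 0.1) ** 2) - (depth - 3) ** 2


param_grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [1, 3, 5]}
best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, best_score)
```

The number of evaluations is the product of the list lengths (9 here), which is why grids are usually kept small.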

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

How to Scale Numerical Data

Description

Create a Jupyter Notebook tutorial illustrating the importance of scaling numerical data for machine learning. Explore techniques like standardization and min-max scaling to preprocess and normalize numeric features.
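The two techniques named above can be sketched in a few lines of standard-library Python. The function names and the `ages` sample are illustrative assumptions; both functions assume the column is not constant, since a constant column would cause a division by zero.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max_scale(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, 30, 40, 50, 60]
standardized = standardize(ages)
scaled = min_max_scale(ages)
print(standardized)
print(scaled)
```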

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Missing Data Imputation with Machine Learning Methods

Description

Create a Jupyter Notebook tutorial that demonstrates different machine learning methods for effectively handling missing data in datasets.
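One method such a tutorial might cover is k-nearest-neighbours imputation. The sketch below is a simplified, standard-library version for rows of the form (feature, target), where a missing target is `None`; the `knn_impute` name and the sample rows are assumptions made for illustration.

```python
def knn_impute(rows, k=2):
    """Fill missing targets (None) with the mean target of the k complete
    rows whose feature value is closest."""
    complete = [(x, y) for x, y in rows if y is not None]
    imputed = []
    for x, y in rows:
        if y is None:
            neighbours = sorted(complete, key=lambda row: abs(row[0] - x))[:k]
            y = sum(ny for _, ny in neighbours) / len(neighbours)
        imputed.append((x, y))
    return imputed


rows = [(1.0, 10.0), (2.0, 20.0), (3.0, None), (4.0, 40.0), (10.0, 100.0)]
filled = knn_impute(rows, k=2)
print(filled)
```

The missing target at feature value 3.0 is filled from its two closest neighbours (2.0 and 4.0) rather than from the global mean, which the distant row at 10.0 would distort.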

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

How to Transform Numerical to Categorical Data

Description

Craft a Jupyter Notebook tutorial explaining how to convert numerical data into categorical format. Illustrate use cases and methods for creating meaningful categories from continuous variables.
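Binning with fixed cut points is probably the simplest method to start from. This is a minimal sketch using the standard library's `bisect` module; the `bin_values` helper, the cut points, and the labels are illustrative assumptions.

```python
from bisect import bisect_right


def bin_values(values, edges, labels):
    """Map each number to a label. `edges` must be sorted; a value equal
    to an edge falls into the higher bin."""
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return [labels[bisect_right(edges, v)] for v in values]


ages = [5, 17, 18, 42, 65, 80]
age_groups = bin_values(ages, edges=[18, 65], labels=["minor", "adult", "senior"])
print(age_groups)
```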

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

How to Make Distributions More Gaussian

Description

Develop a Jupyter Notebook tutorial on transforming non-Gaussian distributions into more Gaussian-like ones. Explore techniques such as log transformations, among others, to bring the distribution of the data closer to normal.
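To show the effect of a log transformation, one option is to compare skewness before and after. The sketch below computes the moment-based skewness by hand with the standard library; the `skewness` helper and the `incomes` sample are assumptions, and `log1p` (the log of 1 + x) is used so that zeros would not break the transform.

```python
import math
from statistics import mean


def skewness(values):
    """Moment-based skewness: 0 for symmetric data, positive for a long right tail."""
    mu = mean(values)
    n = len(values)
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


incomes = [1, 2, 2, 3, 3, 4, 5, 8, 20, 100]  # right-skewed sample
logged = [math.log1p(v) for v in incomes]

print(skewness(incomes), skewness(logged))
```

The transformed sample is still right-skewed, only less so, which is a useful reminder that a log transform reduces skew without guaranteeing normality.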

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Simple Linear Regression in Machine Learning

Description

Create a Jupyter Notebook tutorial explaining the concept of simple linear regression and how to perform it in Python for basic predictive modeling.
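For reference, the ordinary least-squares fit of a single feature has a closed form: the slope is the covariance of x and y divided by the variance of x. A minimal sketch, with an assumed `fit_line` helper and made-up sample data:

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


hours = [1, 2, 3, 4, 5]        # hypothetical study hours
scores = [52, 55, 61, 64, 68]  # hypothetical exam scores

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # prediction for 6 hours
print(slope, intercept, predicted)
```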

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

How to Define Feature Importance

Description

Develop a Jupyter Notebook tutorial explaining the concept of feature importance in machine learning and showcasing some introductory methods.
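One introductory method is permutation importance: shuffle a single feature column and measure how much the model's error grows. The sketch below is a simplified standard-library version; `model` is a hypothetical, already fitted predictor that uses only the first feature, and all names and data are assumptions.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of model predictions over the rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_features, seed=0):
    """Increase in MSE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mse(model, shuffled, targets) - baseline)
    return importances


def model(row):
    """Hypothetical fitted model: depends on feature 0 and ignores feature 1."""
    return 3.0 * row[0]


rows = [(float(i), float((i * 7) % 5)) for i in range(8)]
targets = [3.0 * r[0] for r in rows]
importances = permutation_importance(model, rows, targets, n_features=2)
print(importances)
```

Shuffling the ignored feature leaves the predictions unchanged, so its importance is exactly zero, while shuffling the used feature increases the error.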

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Introduction to Classification with Scikit-Learn

Description

Develop a beginner-friendly tutorial on classification using Scikit-Learn. Explain the basics of classification algorithms and guide users through building their first classifier.

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

How to Encode Categorical Data

Description

Create a Jupyter Notebook tutorial that guides beginners through encoding categorical data for machine learning tasks. Cover techniques such as one-hot encoding and others to convert categorical variables into numerical form.
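One-hot encoding can be illustrated without any library: each distinct category gets its own 0/1 column. The `one_hot_encode` helper and the `colors` sample are assumptions; categories are sorted so the column order is reproducible.

```python
def one_hot_encode(values):
    """Return the sorted category list and one 0/1 row per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded


colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)
print(categories)
print(encoded)
```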

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Introduction to Time Series Analysis

Description

Create a tutorial that introduces beginners to time series data analysis. Cover basic concepts and simple forecasting techniques.
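Two of the simplest forecasting baselines are the naive forecast (repeat the last value) and the moving-average forecast (average the last few values). A minimal sketch, with assumed function names and a made-up `sales` series:

```python
def naive_forecast(series):
    """Predict that the next value equals the most recent one."""
    return series[-1]


def moving_average_forecast(series, window):
    """Predict the next value as the mean of the last `window` observations."""
    if not 1 <= window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


sales = [100, 110, 120, 130, 140, 150]  # hypothetical monthly sales
print(naive_forecast(sales))
print(moving_average_forecast(sales, window=3))
```

On a steadily rising series like this one, the moving average lags behind the trend, which is a limitation worth demonstrating in the tutorial.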

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Handling Imbalanced Datasets with Undersampling

Description

Create a tutorial that introduces beginners to addressing imbalanced datasets by undersampling the majority class. Explain techniques such as random undersampling, among others.
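Random undersampling can be sketched with the standard library alone: every class is cut down, without replacement, to the size of the smallest class. The `random_undersample` helper and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = min(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):  # sampling without replacement
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


samples = list(range(10))
labels = [0] * 8 + [1] * 2  # 8 majority samples, 2 minority samples
balanced_samples, balanced_labels = random_undersample(samples, labels)
print(Counter(balanced_labels))
```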

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Handling Imbalanced Datasets with Oversampling

Description

Craft a tutorial for beginners on addressing imbalanced datasets by oversampling the minority class. Show techniques like random oversampling and synthetic oversampling using SMOTE, or others of your choice!
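As a baseline to compare against SMOTE, plain random oversampling duplicates minority samples, drawn with replacement, until every class matches the largest one. The sketch below covers only this random variant, not SMOTE; the `random_oversample` helper and the toy data are assumptions.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples so every class matches the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Keep every original sample, then draw the shortfall with replacement
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


samples = list(range(10))
labels = [0] * 8 + [1] * 2  # 8 majority samples, 2 minority samples
balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only; duplicating samples before splitting would leak copies into the test set.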

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

How to Perform PCA Dimensionality Reduction

Description

Create a Jupyter Notebook tutorial that introduces Principal Component Analysis (PCA) for dimensionality reduction. Explain the concept of PCA and its applications, and walk through how to use it in machine learning projects.
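To make the idea concrete, PCA on two features can be worked out in closed form, because the covariance matrix is only 2x2. The sketch below is restricted to that two-feature case and is not a general implementation; `pca_2d` and the sample points are assumptions, and the data must have non-zero variance.

```python
import math
from statistics import mean


def pca_2d(points):
    """Return the first principal axis (unit vector) and the fraction of
    variance it explains, for a list of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_bar, y_bar = mean(xs), mean(ys)
    n = len(points)
    # Entries of the sample covariance matrix [[a, b], [b, c]]
    a = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    c = sum((y - y_bar) ** 2 for y in ys) / (n - 1)
    b = sum((x - x_bar) * (y - y_bar) for x, y in points) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix
    top = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if b != 0:
        vx, vy = b, top - a  # eigenvector belonging to the largest eigenvalue
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm), top / (a + c)


points = [(1, 2), (2, 4), (3, 6), (4, 8)]  # all points lie on y = 2x
axis, explained = pca_2d(points)
print(axis, explained)
```

Because the sample points lie exactly on a line, a single component captures all of the variance, so the two features can be reduced to one without loss.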

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Classification Metrics in Machine Learning

Description

Develop a tutorial on common classification metrics in machine learning, such as accuracy, precision, recall, and F1-score. Explain when to use each metric and how to calculate them.
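All four metrics follow from the counts of true/false positives and negatives, so they can be computed by hand before introducing a library. A minimal sketch for the binary case; the function name and the sample labels are assumptions, and undefined ratios are returned as 0.0 by choice.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```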

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Implement Feature Engineering Tutorial

Description

Create a comprehensive tutorial on feature engineering to help both new and experienced team members understand and apply this crucial aspect of our data science work.

Tasks

Prepare an outline for the feature engineering tutorial, covering essential concepts and techniques.
Write a detailed introduction explaining the importance of feature engineering in our projects.
Provide clear examples of feature engineering methods used in our current project(s).
Include code snippets, demonstrations, and real-world use cases to illustrate the concepts.
Add references to external resources or research papers for further reading.
Include interactive code notebooks (e.g., Jupyter notebooks) that users can experiment with.
Add visuals, such as diagrams or charts, to aid in understanding.
Ensure the tutorial is well-structured, easy to follow, and suitable for both beginners and advanced team members.

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Python for Data Science

Data Analysis of Text-Based Dataset

Description

It's crucial to perform in-depth data analysis and visualization to gain insights, discover patterns, and make informed decisions. This issue focuses on conducting an analysis of the text data and creating visualizations that will aid our understanding.

Tasks

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

How to Perform Cross-Validation

Description

Create a tutorial for beginners that explains the concept of cross-validation in machine learning. Guide users through implementing k-fold cross-validation to assess model performance.
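The mechanics of k-fold cross-validation are easy to show by hand: split the indices into k folds, hold each fold out once, and score a model trained on the rest. In this sketch the "model" is a deliberately trivial baseline that predicts the training mean; all names are assumptions, and the folds are contiguous, so real data would usually be shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def cross_validate(ys, k):
    """Mean absolute error per fold for a 'predict the training mean' baseline."""
    scores = []
    for train, test in k_fold_indices(len(ys), k):
        prediction = sum(ys[i] for i in train) / len(train)
        scores.append(sum(abs(ys[i] - prediction) for i in test) / len(test))
    return scores


folds = list(k_fold_indices(10, 3))
print([len(test) for _, test in folds])
print(cross_validate([3.0, 5.0, 4.0, 6.0, 5.0, 7.0], k=3))
```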

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Tutorial on Anomaly Detection

Description

Create a Jupyter Notebook tutorial that implements anomaly detection. Explain the concept of anomaly detection and its applications, and walk through how to use it in machine learning projects.
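A simple statistical detector can serve as a first example before any model-based method. The sketch below flags values with a large modified z-score, which is built on the median and the median absolute deviation (MAD) so that the anomaly itself does not distort the scale. The function name, the sample readings, and the commonly used 3.5 threshold are assumptions.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # more than half the values are identical; score is undefined
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]  # hypothetical sensor readings
anomalies = mad_anomalies(readings)
print(anomalies)
```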

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Implement Outlier Detection Tutorial

Description:
We need to create a comprehensive tutorial on outlier detection techniques and their practical implementation for our data science community. Outliers can significantly impact our data analysis and machine learning models, and it's essential that our users are well-informed about how to handle them.
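One technique the tutorial could open with is the interquartile-range (IQR) rule: anything more than 1.5 IQRs below the first quartile or above the third quartile is flagged. A minimal sketch using `statistics.quantiles`; the `iqr_outliers` helper, the sample data, and the conventional 1.5 multiplier are assumptions.

```python
from statistics import quantiles


def iqr_outliers(values, factor=1.5):
    """Return values outside [Q1 - factor * IQR, Q3 + factor * IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if v < low or v > high]


data = [2, 3, 3, 4, 4, 5, 5, 6, 30]
outliers = iqr_outliers(data)
print(outliers)
```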

Tasks:

Expected Outcome:
Once this issue is completed, we will have a well-documented and informative tutorial on outlier detection. This resource will help our community members gain a better understanding of how to handle outliers in their data science projects.

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Clustering with K-Means

Description

Create a beginner-friendly tutorial introducing the concept of clustering using K-Means.
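K-Means alternates two steps until nothing changes: assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below implements that loop with the standard library (Python 3.8+ for `math.dist`); the initial centroids are passed in explicitly to keep the run deterministic, and all names and data are assumptions.

```python
import math


def k_means(points, centroids, max_iter=100):
    """Lloyd's algorithm: returns (labels, final centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
print(labels, centroids)
```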

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Exploratory Data Analysis (EDA) with ydata-profiling

Description

Develop a tutorial that walks beginners through exploratory data analysis using the ydata-profiling library.

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook

Data Augmentation with Synthetic Data using ydata-synthetic

Description

Develop a beginner-friendly tutorial on data augmentation using the ydata-synthetic library. Explain how to generate synthetic data to increase the size of a training dataset, with the aim of improving model performance.

Acceptance Criteria

Submit a Jupyter notebook containing the tutorial and the necessary datasets, if needed
Modify the README.md file to include the new tutorial and a link to the added notebook
