python_eda's Introduction

Don't forget to hit the ⭐ if you like this repo.

About Us

The information in this GitHub repository is part of the materials for the subject High Performance Data Processing (SECP3133). This folder contains general Exploratory Data Analysis (EDA) information as well as EDA case studies using Malaysian datasets. These case studies were created by Bachelor of Computer Science (Data Engineering) students at Universiti Teknologi Malaysia. In addition, my research group contributed materials and case studies. Thank you to the collaborators who shared their knowledge in this repository.

Exploratory Data Analysis

✅️ EDA involves using graphics and visualizations to explore and analyze a data set. The goal is to explore, investigate and learn, as opposed to confirming statistical hypotheses.

✅️ Data scientists use EDA to analyze and explore datasets and to summarize their main characteristics.

✅️ EDA makes it easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check assumptions.

✅️ EDA is primarily used to provide a better understanding of a dataset's variables and the relationships between them.

✅️ EDA can also help determine whether the statistical techniques you are considering are appropriate for data analysis.

✅️ Developed by the American mathematician John Tukey in the 1970s, EDA techniques remain widely used in data exploration today.
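As a rough illustration of "summarizing the main characteristics" and "spotting anomalies", the sketch below computes a few summary statistics and flags values with the common 1.5 × IQR rule of thumb. It uses only the Python standard library. The `rents` list, the function names `summarize` and `iqr_outliers`, and the threshold `k=1.5` are assumptions made for this example, not part of the course materials.

```python
import statistics

# Hypothetical sample values (made up for illustration, not a real dataset).
rents = [1200, 1350, 1280, 1400, 1320, 1250, 9800, 1380]


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
        "min": min(values),
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Flag values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


summary = summarize(rents)
outliers = iqr_outliers(rents)
print(summary)
print("Possible anomalies:", outliers)
```

A flagged value is only a candidate for closer inspection. Whether it is an error or a genuine extreme depends on the dataset.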

Why is EDA so important in data science?

✅️ The main purpose of EDA is to help you look at the data before making any assumptions. In addition to better understanding the patterns in the data or detecting unusual events, it also helps you find interesting relationships between variables.

✅️ Data scientists can use exploratory analysis to ensure that the results they produce are valid and relevant to desired business outcomes and goals.

✅️ EDA also helps stakeholders by verifying that they are asking the right questions.

✅️ EDA can help to answer questions about standard deviations, categorical variables, and confidence intervals.

✅️ Once the exploratory analysis is complete and its insights have been drawn, the resulting features can be used for more sophisticated data analysis or modeling, including machine learning.
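The questions about standard deviations, categorical variables, and confidence intervals mentioned above can be sketched with the standard library alone. The `records` data, the region labels, and the choice of a normal-approximation 95% interval are assumptions made for this example. With so few rows, a t-based interval would be wider and more appropriate.

```python
import math
import statistics
from collections import Counter

# Hypothetical records: (region, daily count). Values are made up.
records = [
    ("north", 120), ("south", 80), ("north", 150),
    ("east", 60), ("south", 90), ("north", 130),
]

counts = [n for _, n in records]

# Spread of a numeric variable.
mean = statistics.mean(counts)
stdev = statistics.stdev(counts)  # sample standard deviation

# Frequency table for a categorical variable.
region_freq = Counter(region for region, _ in records)

# Approximate 95% confidence interval for the mean (normal approximation).
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * stdev / math.sqrt(len(counts))
ci = (mean - margin, mean + margin)

print("mean:", mean, "stdev:", round(stdev, 2))
print("frequency by region:", dict(region_freq))
print("approx. 95% CI for the mean:", tuple(round(x, 2) for x in ci))
```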

📖 Notes

Basic Concept

Code & Practice

Videos

Kaggle: Notebook

GitHub

📖 Lab

| No | Dataset | Colab | GitHub |
|----|---------|-------|--------|
| 1 | Boston | Open in Colab | Open in GitHub |
| 2 | Car Features and MSRP | Open in Colab | Open in GitHub |
| 3 | Housing Dataset | Open in Colab | Open in GitHub |
| 4 | United Nations Development Corporation | Open in Colab | Open in GitHub |

🌟 Case Study: Exploratory Data Analysis

| Team | Title | Colab | GitHub |
|------|-------|-------|--------|
| 404 Error | Property in Kuala Lumpur | Open in Colab | Open in GitHub |
| Alrite | The Exportation of Plantation in Sarawak | Open in Colab | Open in GitHub |
| BEFE | Covid-19 Clusters in Malaysia | Open in Colab | Open in GitHub |
| Boboiboy | Property Listings in Kuala Lumpur | Open in Colab | Open in GitHub |
| COLBY | Malaysia GE-14 Result | Open in Colab | Open in GitHub |
| FANTOM | Daily recorded COVID-19 cases at state level in Malaysia | Open in Colab | Open in GitHub |
| HAHA | Foreign Direct Investment in Malaysia | Open in Colab | Open in GitHub |
| HD | Guna Tanah Tampin 2021 | Open in Colab | Open in GitHub |
| KIA | Malaysia State Election 2018 | Open in Colab | Open in GitHub |
| LAB | Malaysia Air Pollution Analysis | Open in Colab | Open in GitHub |
| MAAM | Malaysia Hospital Patient Movement Analysis | Open in Colab | Open in GitHub |
| MEOW | Capacity and utilisation of Intensive Care Unit (ICU) beds during COVID-19 | Open in Colab | Open in GitHub |
| MM | Malaysia's 14th State Election Result | Open in Colab | Open in GitHub |
| PIXALATED | Number of deaths in Malaysia from 2001 to 2018 | Open in Colab | Open in GitHub |
| POTATO | Death by state, sex and age group Malaysia 2001-2018 | Open in Colab | Open in GitHub |
| QnX | Real Estate Kuala Lumpur Malaysia | Open in Colab | Open in GitHub |
| SAMVERSE | Restaurant Rating in Malaysia | Open in Colab | Open in GitHub |
| SMOL | Population in Malaysia from 2010-2019 | Open in Colab | Open in GitHub |
| SQ | Number of Cases and Incidents Rate of Communicable Disease by State | Open in Colab | Open in GitHub |
| TUK | Number of Government School Pupils by District Education Office and State 2017-2018 | Open in Colab | Open in GitHub |
| UWU | Property Listings in Kuala Lumpur | Open in Colab | Open in GitHub |

🌟 Case Study

| Name | Title | Colab | GitHub |
|------|-------|-------|--------|
| Li Jing | ABC | Open in Colab | Open in GitHub |
| Saleh Dhekre Saber Saleh | ABC | Open in Colab | Open in GitHub |
| Eman Al Jabarti | ABC | Open in Colab | Open in GitHub |
| Anwar Said Salim Al Talaii | ABC | Open in Colab | Open in GitHub |
| Zhu Caihua | ABC | Open in Colab | Open in GitHub |
| Shiekhah AL Binali | ABC | Open in Colab | Open in GitHub |
| Li Haopeng | ABC | Open in Colab | Open in GitHub |

Contribution 🛠️

Please create an Issue for any improvements, suggestions, or errors in the content.

You can also contact me via LinkedIn with any other queries or feedback.

python_eda's People

Contributors

drshahizan, nellyexey, peiyu00, terence172, chongkz29, madinasuraya, maizatulafrina, samsamsambal, diniehazim, racquelmae, nursyamalia, sakinahalizzah, afifhazmie, tanyongsheng728, myzanazifah, izzahmardhiah, yejui626, mikheladam, jrkong2001utm, mqilee, adrinaasyiqin, jokeryde, aimanhafizi619, farrahinutm, rishmafathima, yongzy328, madihah04, ongwah, kelvinnn-2, prowong01
