

Data Science Learning

Autoencoder

Convolutional Autoencoder

Project by Akshay Bahadur:
https://github.com/akshaybahadur21/FaceEncoder

AutoML and AutoKeras

A new library in Python: import ai
AutoML by H2O:
https://github.com/h2oai/h2o-tutorials/tree/master/h2o-world-2017/automl
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html
AutoKeras:
https://autokeras.com/
https://github.com/jhfjhfj1/autokeras
A blog by Favio:
https://towardsdatascience.com/auto-keras-or-how-you-can-create-a-deep-learning-model-in-4-lines-of-code-b2ba448ccf5e
A blog from H2O:
https://www.h2o.ai/blog/the-different-flavors-of-automl/

Betterment Projects

For traffic management:
https://www.siemens.com/innovation/en/home/pictures-of-the-future/infrastructure-and-finance/smart-cities-ai-based-traffic-control.html
https://www.alibabacloud.com/et/city

Blogs and Websites for DS Learning

A curated list by Shivam Panchal:
https://www.linkedin.com/feed/update/urn:li:activity:6432565071913807872
Blogs by Akhil Gupta:
https://medium.com/@guptakhil
Research:
https://deepkapha.ai/
DL Training:
fast.ai

Colab - From Google

How to start with:
https://www.kdnuggets.com/2018/02/google-colab-free-gpu-tutorial-tensorflow-keras-pytorch.html

Computer Vision Projects and Blogs

https://towardsdatascience.com/creating-custom-fortnite-dances-with-webcam-and-deep-learning-9b1a236c1b59
https://www.youtube.com/c/DeepGamingAI
https://github.com/dabasajay/Image-Caption-Generator
https://github.com/bendangnuksung/Image-OutPainting

Bible: https://www.pyimagesearch.com/pyimagesearch-gurus/

Courses

https://digitaldefynd.com/deep-learning-courses-training-tutorial-certifications/

Data Mining

https://blogs.systweak.com/2017/03/best-19-free-data-mining-tools/

Data Visualization

Seaborn:
https://elitedatascience.com/python-seaborn-tutorial
Pandas Plots:
https://pandas.pydata.org/pandas-docs/stable/visualization.html
Plotly:
https://plot.ly/python/
Bokeh:
http://nbviewer.jupyter.org/github/bokeh/bokeh-notebooks/blob/master/index.ipynb

Dimensionality Reduction

PCA:
https://medium.com/@aptrishu/understanding-principle-component-analysis-e32be0253ef0
t-SNE:
https://lvdmaaten.github.io/tsne/

Decision Trees

By Brandon Rohrer:
https://www.udemy.com/end-to-end-data-science-decision-trees/

EDA

https://www.kaggle.com/randylaosat/simple-exploratory-data-analysis-passnyc
https://www.kaggle.com/infocusp/holographic-view-of-underperforming-schools
https://www.kaggle.com/moizzz/eda-and-clustering
https://www.kaggle.com/hungfei/passnyc-the-magic-of-data-science
https://www.kaggle.com/rishih/present-sir

Embedding

https://www.kaggle.com/colinmorris/embedding-layers

Excel

Training by Yashna Bhatter for Advanced Excel:
https://www.youtube.com/watch?v=e9Vc1wy0eRM&feature=youtu.be

FFN (Flood-filling network)

https://m.medicalxpress.com/news/2018-07-artificial-neural-networks-reveal-brain.html
https://github.com/google/ffn

Free Books

OpenStax is a non-profit organization for sharing books.
https://openstax.org/
e-books:

GAN

Interactive visualization tool for learning GAN:
https://poloclub.github.io/ganlab/
A short blog on Medium:
https://medium.com/@devnag/generative-adversarial-networks-gans-in-50-lines-of-code-pytorch-e81b79659e3f

Gesture Detection

https://medium.com/@choxi/gesture-detection-using-tensorflowjs-cedfdc9dbab6

Graph Data Structures

https://www.geeksforgeeks.org/graph-data-structure-and-algorithms/

Hyper-parameters Tuning

DeepReplay is an interesting Python package for replaying the training process with visualization:
https://towardsdatascience.com/hyper-parameters-in-action-introducing-deepreplay-31132a7b9631
https://github.com/dvgodoy/deepreplay

Interpretability

https://www.linkedin.com/feed/update/urn:li:activity:6425941973617729536

Kaggle Competitions to Practice

A list by Shivam:
https://www.linkedin.com/feed/update/urn:li:activity:6423992175910051840

Winning Competitions by SRK:
https://www.kaggle.com/sudalairajkumar/winning-solutions-of-kaggle-competitions

Keras

https://developer.ibm.com/articles/cc-get-started-keras/
https://keras.io/
https://elitedatascience.com/keras-tutorial-deep-learning-in-python
https://www.kdnuggets.com/2017/10/seven-steps-deep-learning-keras.html
https://machinelearningmastery.com/introduction-python-deep-learning-library-keras/

LDA (Latent Dirichlet Allocation)

https://medium.com/@lettier/how-does-lda-work-ill-explain-using-emoji-108abf40fa7d
https://rare-technologies.com/what-is-topic-coherence/
https://www.machinelearningplus.com/nlp/topic-modeling-gensim-python/

Mathematics

A great article by Tirthajyoti Sarkar:
https://towardsdatascience.com/essential-math-for-data-science-why-and-how-e88271367fbd

An article by Vincent Chen:
https://blog.ycombinator.com/learning-math-for-machine-learning/

Monte Carlo Tree Search

Blogs:
https://www.kdnuggets.com/2017/12/introduction-monte-carlo-tree-search.html
https://jeffbradberry.com/posts/2015/09/intro-to-monte-carlo-tree-search/

Neural Network Investigation

innvestigate package:
https://github.com/albermax/innvestigate
lucid package:
https://github.com/tensorflow/lucid

NLP

A great repository for all NLP tasks and information:
http://nlpprogress.com/
A GitHub repository of NLP knowledge by Han Xiao:
https://github.com/hanxiao/tf-nlp-blocks
Another awesome tutorial on NLP:
https://blog.insightdatascience.com/how-to-solve-90-of-nlp-problems-a-step-by-step-guide-fda605278e4e
A great blog on the GluonNLP toolkit:
https://medium.com/apache-mxnet/gluonnlp-deep-learning-toolkit-for-natural-language-processing-98e684131c8a
Text pre-processing and exploration:
https://towardsdatascience.com/what-app-descriptions-tell-us-text-data-preprocessing-in-python-afc7ed88360d
Transformer:
https://ai.googleblog.com/2018/08/moving-beyond-translation-with.html
https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/research/universal_transformer.py
Finetuning in NLP:
https://finetune.indico.io/
Text Classification:
https://developers.google.com/machine-learning/guides/text-classification/
Blog to start with:
https://www.kdnuggets.com/2018/06/getting-started-natural-language-processing.html

Object Detection

https://www.learnopencv.com/deep-learning-based-object-detection-using-yolov3-with-opencv-python-c/

OpenCV

https://www.learnopencv.com/deep-learning-based-object-detection-using-yolov3-with-opencv-python-c/

Overall

A repository by Shervine, a graduate student at Stanford:
https://stanford.edu/~shervine/teaching/cs-229.html
Getting started with AI:
https://montrealartificialintelligence.com/academy/
A great GitHub repository of notes by Abhini Shetye:
https://gist.github.com/abhinishetye/38ece3fe86d695593bb39fd24003aebb
Machine Learning training by the European Space Agency:
https://github.com/jmartinezheras/2018-MachineLearning-Lectures-ESA
Guide by JP Morgan:
https://news.efinancialcareers.com/uk-en/285249/machine-learning-and-big-data-j-p-morgan
Bible on computer vision:
https://www.pyimagesearch.com/
Curated list of resources by Shivam Panchal:
https://www.linkedin.com/feed/update/urn:li:activity:6429636841338740736
https://www.linkedin.com/feed/update/urn:li:activity:6430354635386687489
https://www.linkedin.com/feed/update/urn:li:activity:6434061314544500736
List of ML interpretability resources:
https://github.com/jphall663/awesome-machine-learning-interpretability
Videos by Brandon Rohrer:
http://brohrer.github.io/
https://www.youtube.com/user/BrandonRohrer/videos
Deep Learning by Lex Fridman:
https://lexfridman.com/
A great site:
https://elitedatascience.com/start-here
Check this out:
https://www.floydhub.com/
Data Science Resources selected by Shujian:
https://github.com/Shujian2015/FreeML/blob/master/README.md
Bible of Deep Learning:
http://wiki.fast.ai/index.php/Main_Page

Pandas

These tricks are really great:
https://realpython.com/python-pandas-tricks/
An awesome cheatsheet:
https://medium.com/@msalmon00/helpful-python-code-snippets-for-data-exploration-in-pandas-b7c5aed5ecb9

Podcasts

HoDS by Kate Strachnyi:
http://storybydata.com/humans-of-data-science-hods/
Call of Data by IDEAS and Data Application Lab:
http://www.claoudml.co/
Super Data Science by Kirill Eremenko:
https://www.superdatascience.com/podcast/
Data Science Office Hours:
https://www.youtube.com/channel/UC5c7r0SlnNmPfqxEyni71FA/videos
Data Skeptic Podcast by Kyle Polich:
https://dataskeptic.com/podcast/
Experian News by Mike Delgado:
http://www.experian.com/blogs/news/author/michael-delgado/
Linear Digressions by Ben Jaffe:
http://lineardigressions.com/
Machine Learning Guide:
http://ocdevel.com/mlg

Python

Curated list of sources by Shivam Panchal:
https://www.linkedin.com/feed/update/urn:li:activity:6423461792915255296
The best I have found so far is by Corey Schafer on YouTube:
https://www.youtube.com/channel/UCCezIgC97PvUuR4_gbFUs5g

Python and Jupyter Tips

https://blog.dominodatalab.com/lesser-known-ways-of-using-notebooks/
http://pynash.org/2013/03/06/timing-and-profiling/

PyTorch

It is always good to start with the official documentation:
https://pytorch.org/tutorials/index.html
https://pytorch.org/tutorials/beginner/pytorch_with_examples.html#
A great tutorial from Stefan Otte on YouTube:
https://www.youtube.com/watch?v=_H3aw6wkCv0
Notebook:
https://github.com/sotte/pytorch_tutorial/blob/7ed1364c11a50308c93aa0393767cbabfc53f6d4/notebooks/00_index.ipynb

RegEx

https://www.youtube.com/watch?v=onaJjAbfbw0&list=PL8FFE3F391203C98C&index=3&t=0s

Reinforcement Learning

Dopamine, a TensorFlow-based framework, is now open source, which is interesting. Here are the details:
https://github.com/google/dopamine
https://github.com/google/dopamine/blob/master/docs/api_docs/python/index.md
Tutorial with blogs:
https://towardsdatascience.com/introduction-to-various-reinforcement-learning-algorithms-i-q-learning-sarsa-dqn-ddpg-72a5e0cb6287

Simulation of the environment with agents' dynamics. Need to explore more on this.
https://arxiv.org/abs/1808.10692
https://arxiv.org/pdf/1808.10692.pdf

Research

Local SGD instead of Mini Batch GD:
https://arxiv.org/abs/1808.07217
Democratizing Research Papers:
https://distill.pub/
Algorithm that outperformed DL:
https://www.technologyreview.com/s/611568/evolutionary-algorithm-outperforms-deep-learning-machines-at-video-games/
Neural Image Assessment:
https://arxiv.org/pdf/1709.05424.pdf

RGF

Blog with code:
https://www.analyticsvidhya.com/blog/2018/02/introductory-guide-regularized-greedy-forests-rgf-python/

SciPy

Tools and techniques for the scientific Python ecosystem, covering most of the topics:
http://www.scipy-lectures.org/

Self-Driving

https://selfdrivingcars.mit.edu/deeptraffic-visualization/

Statistics

Starting with the basics:
https://medium.com/codezillas/statistics-review-for-data-scientists-and-management-df8f94760221
One of the best books out there:
https://web.stanford.edu/~hastie/ElemStatLearn//
Fundamental Statistics:
https://sites.google.com/site/fundamentalstatistics/chapter1
Resourceful blogs from Ankit Rathi are really helpful:
https://towardsdatascience.com/@rathi.ankit
Statistics Online:
http://onlinestatbook.com/2/

Style Transfer

Implementation by Siraj Raval:
https://www.youtube.com/watch?v=YoBEGQD3LCc

TensorFlow

Google Machine Learning crash course uses TensorFlow framework. It is great to start from here:
https://developers.google.com/machine-learning/crash-course/
Official documentation for TensorFlow is a great place to start with as well. It lets us run the code on Colab with GPU for free :)
https://www.tensorflow.org/tutorials/keras/basic_classification
A great blog on books and tutorials:
https://www.analyticsindiamag.com/top-10-free-books-and-resources-for-learning-tensorflow/

Time Series

A list of collections by Shivam Panchal:
https://www.linkedin.com/feed/update/urn:li:activity:6434061314544500736
A paper to study:
https://arxiv.org/ftp/arxiv/papers/1302/1302.6613.pdf
Paper: LSTM for TimeSeries:
https://www.researchgate.net/publication/221079584_Applying_LSTM_to_Time_Series_Predictable_through_Time-Window_Approaches
A python package to automatically calculate features for time series:
https://tsfresh.readthedocs.io/en/latest/
Prophet for forecasting:
https://facebook.github.io/prophet/docs/quick_start.html
https://github.com/luke14free/pm-prophet

vid2vid

https://github.com/NVIDIA/vid2vid

YOLO

In R:
https://heartbeat.fritz.ai/object-detection-in-just-3-lines-of-r-code-using-tiny-yolo-b5a16e50e8a0
