yaoyaoustc / yaoyaoustc.github.io
Laid-back machine learning database
Lazy Machine Learning Learner
https://yaoyaoustc.github.io/2019/11/14/Tecent2019b/#more
In my previous post I summarized the project information and challenges; now let’s take a look at the data.
Data download
First, the raw dataset can be downloaded here.
https://yaoyaoustc.github.io/2019/11/26/MelRental-p3/#more
Now let’s try word2vec on the property details page and see if we can find something interesting.
load libs
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as
https://yaoyaoustc.github.io/2020/04/03/pyspark-rdd/#more
PySpark Databricks Exercise: RDD
The purpose of this practice is to get a deeper understanding of the properties of RDDs. We will not talk about what an RDD is or what that means. There are plenty of mat
https://yaoyaoustc.github.io/2019/11/22/Tecent2019-ex1/
Just for some extra fun, let’s do some plots to explore the ads dataset a bit.
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    import calendar
    %matplotlib i
https://yaoyaoustc.github.io/2019/11/19/Tecent2019c/
Just another day of data cleaning… this dataset really required lots and lots of cleaning…
Data preparation: continued
From the last notebook we’ve got the correct label: the next-24-hour exposure rate.
https://yaoyaoustc.github.io/2019/10/22/python-3-iterate-summary/#more
Python 3 iteration summary
A short summary of iteration in Python 3.
iteritems(): used for Series and DataFrame: for index, data in a.iteritems():
self: used when the index is not needed and only the items are traversed
https://yaoyaoustc.github.io/2019/11/27/CLFR-metrics/#more
Let’s put together a summary of the common classification evaluation metrics: what they mean and how to use them.
Accuracy
Meaning: correct identifications / all examples
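As a minimal sketch of the accuracy definition above (correct identifications / all examples), assuming the labels arrive as two equal-length Python lists. The function name `accuracy` and the sample labels are illustrative and not taken from the original post.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
print(accuracy(y_true, y_pred))  # 3 of the 5 predictions are correct
```

Accuracy treats every mistake the same way, so on heavily imbalanced labels it can look high even when the rare class is never predicted.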
https://yaoyaoustc.github.io/2020/08/08/myfeedback/#more
Ongoing work
There are always moments when I’m frustrated and lose my confidence. I think it will be a good idea to document the kudos and shout-outs that have been given by my teammates, managers, a
https://yaoyaoustc.github.io/2020/07/27/Interview-prep/
Some interview questions
What’s the difference between boosting and bagging?
Bagging attempts to reduce the chance of overfitting complex models. Bagging, short for bootstrap aggregating, creates subsets from sa
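The bagging idea can be sketched roughly as follows, assuming the usual formulation: draw bootstrap subsets with replacement, train one model per subset, and aggregate their outputs (here by majority vote). The excerpt is cut off, so the helper names `bootstrap_sample` and `majority_vote` and the toy data are assumptions for illustration, not the post's own code.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def majority_vote(predictions):
    """Return the most common prediction among the individual models."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
data = list(range(10))  # hypothetical training examples

# Each model in the ensemble would be trained on its own bootstrap subset.
subsets = [bootstrap_sample(data, rng) for _ in range(3)]
for subset in subsets:
    print(sorted(subset))

# Aggregation step: combine the (hypothetical) per-model predictions.
print(majority_vote(["spam", "ham", "spam"]))  # spam
```

Boosting differs in that the models are trained sequentially, each one focusing on the errors left by the previous ones, rather than independently on resampled data.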
https://yaoyaoustc.github.io/2019/11/14/Tecent2019/
First, why did I choose this project?
This competition is one of the “unlike Kaggle” competitions: it is a real-world problem. The target of this project is to estimate an ad’s daily exposure rate from a
https://yaoyaoustc.github.io/2019/10/22/python-3-iterate-summary/
Logistic regression/LDA/KNN/QDA Comparison | Lazy Machine Learning Learner
https://yaoyaoustc.github.io/2020/02/12/bttraverse/
    def inorderTraversal(self, root):
        # write your code here
        results = []
        stack = collections.deque([])
        cur = root
        while cur or stack:
            while
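The excerpt above is cut off inside the inner loop. As a hedged sketch of how an iterative inorder traversal with an explicit stack is commonly completed (the loop body and the `TreeNode` class below are assumptions, since the post's own versions are not shown here):

```python
import collections


class TreeNode:
    """Minimal binary tree node (hypothetical; the post's node class is not shown)."""

    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def inorder_traversal(root):
    """Return node values in left, node, right order without recursion."""
    results = []
    stack = collections.deque()
    cur = root
    while cur or stack:
        # Walk as far left as possible, remembering the path on the stack.
        while cur:
            stack.append(cur)
            cur = cur.left
        # Visit the deepest unvisited node, then move to its right subtree.
        cur = stack.pop()
        results.append(cur.val)
        cur = cur.right
    return results


#     2
#    / \
#   1   3
root = TreeNode(2, TreeNode(1), TreeNode(3))
print(inorder_traversal(root))  # [1, 2, 3]
```

The explicit stack plays the role of the call stack in the recursive version, so deep trees do not hit Python's recursion limit.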
https://yaoyaoustc.github.io/2019/10/30/GradientBoosting-Parameters-Tune/
Tree class comparisons | Lazy Machine Learning Learner
https://yaoyaoustc.github.io/about
Tree class comparisons | Yao's rabbit hole
https://yaoyaoustc.github.io/2019/11/20/pandas-mindmap/#more
When I first started learning the pandas library, I seriously suffered from a memory issue. My brain said: it’s hard to write the pandas functions to disk while your memory is not enough.
https://yaoyaoustc.github.io/2019/11/21/GradientTree/#more
Gradient Boosting Tree basics: start with a weak, simple learner (e.g. mean()) and get a prediction; use a loss function J to compute the error between y_true and y_predict
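A rough sketch of these first steps, continued into a small boosting loop. It assumes squared error as the loss J, a one-split regression stump as the weak learner, and a single numeric feature; all function names, the learning rate, and the toy data are illustrative assumptions rather than the post's own code.

```python
def mean(values):
    return sum(values) / len(values)


def squared_error(y_true, y_pred):
    """Loss J: mean squared error between targets and predictions."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def fit_stump(x, residuals):
    """Fit a one-split regression stump; return (threshold, left_value, right_value)."""
    candidates = sorted(set(x))[:-1]  # drop the largest value so both sides are non-empty
    if not candidates:
        raise ValueError("need at least two distinct feature values")
    best = None
    for threshold in candidates:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_value, right_value = mean(left), mean(right)
        error = sum((r - left_value) ** 2 for r in left)
        error += sum((r - right_value) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def boost(x, y, n_rounds=20, learning_rate=0.5):
    """Return final predictions and the training loss after each round."""
    pred = [mean(y)] * len(y)  # start from the simplest learner: the mean
    history = [squared_error(y, pred)]
    for _ in range(n_rounds):
        # For squared error, the negative gradient is proportional to the residual.
        residuals = [t - p for t, p in zip(y, pred)]
        threshold, left_value, right_value = fit_stump(x, residuals)
        pred = [
            p + learning_rate * (left_value if xi <= threshold else right_value)
            for xi, p in zip(x, pred)
        ]
        history.append(squared_error(y, pred))
    return pred, history


# Hypothetical toy data with a step between x = 3 and x = 4.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.9, 3.1, 2.9, 3.2]
pred, history = boost(x, y)
print(history[0], history[-1])  # training loss before and after boosting
```

Each round fits the new stump to what the current ensemble still gets wrong, which is why the training loss never increases for a learning rate between 0 and 1.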
https://yaoyaoustc.github.io/2019/11/25/MelRental-p2/
While we’re waiting for the data, let’s do some data visualization, taking one day’s rental data as an example.
load library
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    impor
https://yaoyaoustc.github.io/2020/03/31/pyspark-dataframe-01/#more
PySpark Databricks Exercise
The purpose of this series is to provide some exercises in using PySpark on a cloud platform. Pros:
Load library and create a handler for Spark with SparkSession
https://yaoyaoustc.github.io/2019/11/14/Tecent2019c/#more
Just another day of data cleaning… this dataset really required lots and lots of cleaning…
Data preparation: continued
From the last notebook we’ve got the correct label: the next-24-hour exposure rate.
https://yaoyaoustc.github.io/2019/11/19/tips-multionehot/#more
In the data cleaning stage of a data science project, we may encounter this situation: one column holds multiple features, each feature has multiple values, and they are all stacked in that one column.
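One possible way to unpack such a column is sketched below. The cell format is an assumption made purely for illustration (features separated by `|`, a `:` between feature name and values, values separated by `,`); the original post's actual format is not shown in this excerpt, and the function names are hypothetical.

```python
def expand_cell(cell):
    """Turn 'feat:v1,v2|feat2:v3' into a set of 'feat=value' tokens."""
    tokens = set()
    for part in cell.split("|"):
        if not part:
            continue
        feature, _, values = part.partition(":")
        for value in values.split(","):
            if value:
                tokens.add(f"{feature}={value}")
    return tokens


def multi_one_hot(cells):
    """Return (column_names, rows) with one 0/1 column per feature=value token."""
    expanded = [expand_cell(cell) for cell in cells]
    columns = sorted(set().union(*expanded))
    rows = [[1 if col in tokens else 0 for col in columns] for tokens in expanded]
    return columns, rows


# Hypothetical stacked column, for illustration only.
cells = ["age:1,2|area:10", "age:2|area:10,11"]
columns, rows = multi_one_hot(cells)
print(columns)  # ['age=1', 'age=2', 'area=10', 'area=11']
for row in rows:
    print(row)
```

Because a cell can carry several values for the same feature, each row may have more than one 1 per feature, which is what distinguishes this from ordinary one-hot encoding.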
https://yaoyaoustc.github.io/about/
About
Learning can sometimes be tedious and boring (or is it always?), so I need a place to share some thoughts, write down some learning notes, or just make some memes.
https://yaoyaoustc.github.io/2019/11/27/LR-metrics/
Let’s put together a summary of the common linear regression evaluation metrics: what they mean and how to use them.
Mean Absolute Error (MAE)
Meaning: you should expect your predictions to be off by MAE from the true
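To make the MAE reading concrete, here is a minimal sketch that averages the absolute differences between predictions and true values. The function name `mean_absolute_error` and the sample numbers are illustrative, not from the original post.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true values and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values, for illustration only.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
print(mean_absolute_error(y_true, y_pred))  # (1 + 0 + 2 + 1) / 4 = 1.0
```

MAE is expressed in the same units as the target, which is what makes the "off by MAE on average" interpretation possible.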
https://yaoyaoustc.github.io/2019/11/21/SVM-OR-CART/#more
I think there is no solid evidence that one is better than the other. These two algorithms are built on different methods, with different hyperparameters to tune. Therefore, I think the right approa
https://yaoyaoustc.github.io/2020/03/24/covid19/#more
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    import matplotlib.ticker as ticker
    import calendar
    %matplotlib inline
https://yaoyaoustc.github.io/2019/10/31/Tree-class-comparisons/
A simple comparison between tree classifiers
https://yaoyaoustc.github.io/2019/11/12/MelRental/
Before anything
How to convert a Jupyter notebook to an md file?
    $ jupyter nbconvert --to markdown input.ipynb
Introduction
Seeking rentals in a limited time is always a painful experience. For me, b
https://yaoyaoustc.github.io/2019/11/12/jupyter-demo/P1_Scrap/
Melbourne rental project part 1 webspider | Yao's rabbit hole
https://yaoyaoustc.github.io/2019/10/31/Tree-class-comparisons/#comments
Tree class comparisons | Lazy Machine Learning Learner