blog-for-adaboost's Introduction

Blog-for-adaboost

This is an index of reference material for writing a blog post about AdaBoost.

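http://www.52caml.com/head_first_ml/ml-chapter6-boosting-family/ Chapter 6: ML Made Easy, the Boosting Family. One can imagine, though, that as Spark develops further, this distributed computing platform will become very heavy and keep accumulating features. It will drift further from solving one class of problem with focus and excellence, and the solution it offers for each class of problem will not be especially good.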

http://chuansong.me/n/1146911642859 MXNet column | Li Mu: An Introduction to Deep Learning "Alchemy"

http://geek.csdn.net/news/detail/201207 Demystifying xgboost, the Kaggle power tool

http://blog.csdn.net/heyongluoyao8/article/details/49408131 How to handle class imbalance in the training set for classification. It mentions SMOTE, a method for generating synthetic data.
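As a rough illustration of the SMOTE idea mentioned above, a synthetic minority sample can be produced by interpolating between a minority point and one of its nearest minority neighbours. The function name `smote_sample`, the toy points, and the choice of `k` are assumptions made for this sketch; they are not taken from the linked article.

```python
import random

def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point from a list of minority-class points (tuples)."""
    if rng is None:
        rng = random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to the base point.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
    neighbour = rng.choice(others[:k])
    # Pick a random position on the segment between base and neighbour.
    gap = rng.random()
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))

# Hypothetical minority-class points in two dimensions.
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (3.0, 3.0)]
synthetic = [smote_sample(minority, k=2, rng=random.Random(seed)) for seed in range(5)]
```

Each synthetic point lies on a line segment between two existing minority points, so it stays inside the region the minority class already occupies.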

https://www.zhihu.com/topic/19719221/hot AdaBoost topic page. It raises the question of generalization error, with a lot of discussion.

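Although Bao Jie has long advocated intelligent finance and practices what he preaches, I keep thinking to myself that this is not an easy thing to pull off.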

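A while ago Youdao Translate was fairly reliable, but it no longer is. Scholars and students in China increasingly prefer to write in English. Material from abroad used to need translating into Chinese; now even learning from scholars and students in China requires English, which is exhausting. For a while I thought that, with Youdao being so good, there would be no need to learn English. Now I see it: you can indeed skip learning English, but you have to pay.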

Zhou Zhihua's "Machine Learning" does not say much in its AdaBoost section, but 25th adaboost covers a lot.

https://cseweb.ucsd.edu/classes/fa01/cse291/AdaBoost.pdf A brief introduction written by Freund and Schapire in 2001.

http://math.mit.edu/~rothvoss/18.304.3PM/Presentations/1-Eric-Boosting304FinalRpdf.pdf MIT AdaBoost lecture notes.

http://mccormickml.com/2013/12/13/adaboost-tutorial/ Has a little content; not important.

http://www.multiboost.org/ Multi-class boosting software.

xgboost: I first used it on a 300k dataset. randomforest simply hung, while xgboost ran straight through, which was a real surprise.

https://github.com/luispedro/milk An AdaBoost package in Python, accelerated with C++.

http://totoharyanto.staff.ipb.ac.id/files/2012/10/Building-Machine-Learning-Systems-with-Python-Richert-Coelho.pdf Same author as the one above.

https://www.zhihu.com/question/23003213 Need to come back and look at how to mix Python and C++ in one program.

Nov 13th,2017

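https://www.zhihu.com/topic/20035241/hot Discussion of xgboost. Ma Chao talks about VC dimension; Zhou covers VC dimension, Rademacher complexity, and stability in the computational learning theory chapter of "Machine Learning". Ma Chao: is the diversity of features also one of the main reasons industry rarely uses SVM?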

https://github.com/TracyMcgrady6 Ma Chao's GitHub?

https://www.jiqizhixin.com/articles/2017-11-08-3 Why does XGBoost perform so well in machine learning competitions? Covers the master's thesis by Didrik Nielsen of the Norwegian University of Science and Technology, "Tree Boosting With XGBoost - Why Does XGBoost Win "Every" Machine Learning Competition?"

http://www.cs.columbia.edu/~kathy/cs4701/documents/jason_svm_tutorial.pdf After all this effort, I still need to understand VC dimension.

https://github.com/vividfree/alphabet Luo Wei's GitHub; lots of content. I would like to know how to write C++. The blog is also very good, just a bit messy.

http://txshi-mt.com/2017/08/20/NTUML-7-the-VC-Dimension/ On VC dimension; somewhat hard.

http://web.cecs.pdx.edu/~mm/MachineLearningWinter2017/EnsembleLearning.pdf Ensemble learning examples, fairly detailed.

http://web.cecs.pdx.edu/~mm/MachineLearningSpring2017/EnsembleLearningExerciseSolutions.pdf Worked explanations of the calculations in the example above.

http://lamda.nju.edu.cn/MainPage.ashx Website of Nanjing University's machine learning and data mining institute.

https://petolau.github.io/Ensemble-of-trees-for-forecasting-time-series/ Ensemble learning for forecasting electricity consumption.

http://www.liaad.up.pt/area/jgama/InfSys2017.pdf Applications of ensemble learning to data stream analysis.

http://docs.w3cub.com/scikit_learn/auto_examples/ensemble/plot_adaboost_hastie_10_2/ Discrete versus Real AdaBoost

https://www.leiphone.com/news/201707/JSAaQzOebhHHKuTN.html How to evolve neural networks with automated machine learning.

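https://www.leiphone.com/news/201705/NlTc7oObBqh116Z5.html Full text of a ten-thousand-character talk by Dr. Yu Yang of Nanjing University: Frontiers of Reinforcement Learning (Part 1).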

https://github.com/xiahouzuoxin/notes/tree/master/essays Some trivial odds and ends.

http://mnews.onlinedown.net/trends/83553.html Spatio-temporal data analysis from Microsoft Research; very good material.

https://www.analyticsvidhya.com/blog/2016/02/complete-guide-parameter-tuning-gradient-boosting-gbm-python/ This notes that boosting can indeed reduce both variance and bias.

http://adataanalyst.com/machine-learning/adaboost-python-3/ Introduces a Python implementation of AdaBoost.
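For orientation while reading these implementations, here is a minimal sketch of discrete AdaBoost with one-dimensional decision stumps. It is not the code from any of the linked pages; the function names, the stump search, and the toy dataset are assumptions chosen to keep the example short.

```python
import math

def stump_predict(x, thr, polarity):
    """Decision stump: returns polarity when x >= thr, otherwise -polarity."""
    return polarity if x >= thr else -polarity

def train_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for thr in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, thr, polarity) != y)
            if best is None or err < best[0]:
                best = (err, thr, polarity)
    return best

def adaboost(xs, ys, n_rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n              # start from uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        err, thr, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, thr, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, thr, pol) for a, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy data: labels are in {-1, +1}; no single stump separates them.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, n_rounds=3)
train_predictions = [predict(ensemble, x) for x in xs]
```

On this toy set, three weighted stumps are enough to classify every training point correctly, even though each stump alone misclassifies several points.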

https://github.com/nathanntg/adaboost A more complex AdaBoost implementation.

https://github.com/alinagithub/Adaboost/blob/master/AdaBoost.py I cannot follow this implementation.

https://machinelearningmastery.com/gentle-introduction-gradient-boosting-algorithm-machine-learning/ Lots of historical background.

http://xxuan.me/2017-03-24-adaboost.html AdaBoost explained in Chinese, with Python code.

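https://www.zhihu.com/question/29138020 How to teach yourself Python. This one is impressive; there is also a post shared by a PhD from Nanyang Technological University, which is impressive too.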

https://github.com/lijin-THU Pretty good; covers Python and ML.

http://www.jianshu.com/u/212ef8ed6ac2 Essential for web scraping or Python.

Nov 14th,2017

https://www.leiphone.com/news/201709/QuBqR1h6I7yglwbH.html Zhou Zhihua's talk at the CAIS conference.

https://www.leiphone.com/news/201703/CC6gwUhEAPuG1kvN.html Zhou Zhihua's gcforest.

https://www.douban.com/doulist/3440234/ National Taiwan University's Machine Learning Foundations course.

http://blog.csdn.net/wzmsltw/article/details/51039928 Some Python code for Zhou Zhihua's "Machine Learning".

https://pan.baidu.com/s/1qYRMLvY#list/path=%2F Lecture notes for Zhou Zhihua's "Machine Learning".

http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/ Implementing MapReduce on Hadoop.

https://docs.djangoproject.com/en/1.11/intro/tutorial01/ Django practice; finished hello world!

Nov 15th, 2017

http://decisiontrees.net/decision-trees-and-data-mining-software/ Looking for some decision tree source code.

http://gabrielelanaro.github.io/blog/2016/03/03/decision-trees.html Concise and clear.

https://github.com/revantkumar/Decision-Tree This ID3 implementation is quite good.

https://github.com/chrisspen/dtree Decision tree, a bit more complex.

http://www.cs.uvm.edu/~icdm/algorithms/10Algorithms-08.pdf The paper on the top 10 algorithms.

Nov 16th, 2017

http://people.duke.edu/~ccc14/sta-663/EMAlgorithm.html For learning how to implement EM. This site is very good and covers computational statistics in Python.

https://zhuanlan.zhihu.com/p/28249050 Understanding Jensen's inequality.

Nov 17th,2017

http://www.ruanyifeng.com/blog/2015/02/make.html Ruan Yifeng's introduction to make.

http://www.stat.columbia.edu/~gelman/research/unpublished/stan-resubmit-JSS1293.pdf The Duke site mentions Columbia's Stan; the sampling techniques in it are very appealing.

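Zhou Zhihua's "Machine Learning", p. 153, naive Bayes classifier: the Laplacian correction avoids probability estimates of zero caused by insufficient training samples.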
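A small sketch of the Laplacian correction (add-one smoothing) for a categorical probability estimate follows. The function name `laplace_prob` and the toy colour data are assumptions for illustration, not taken from the book.

```python
from collections import Counter

def laplace_prob(values, value, n_categories):
    """Estimate P(value) with add-one (Laplace) smoothing.

    Without smoothing the estimate would be count / N, which is zero for any
    category that never appears in the sample.
    """
    counts = Counter(values)
    return (counts[value] + 1) / (len(values) + n_categories)

# "blue" never appears in the sample, yet its estimate stays positive.
observed = ["red", "red", "green"]
p_red = laplace_prob(observed, "red", n_categories=3)
p_green = laplace_prob(observed, "green", n_categories=3)
p_blue = laplace_prob(observed, "blue", n_categories=3)   # (0 + 1) / (3 + 3) = 1/6
```

The smoothed estimates still sum to one across the three categories, so they remain a valid distribution.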

https://wenku.baidu.com/view/5799a87302768e9951e73890.html Introduction to Bayesian methods, 85 pages.

http://ai.stanford.edu/~chuongdo/papers/em_tutorial.pdf A primer on the EM algorithm.

http://blog.csdn.net/u011300443/article/details/46763743 Explains how some of the probabilities in the paper above are obtained.

http://tinyheero.github.io/2016/01/03/gmm-em.html Very well written.

http://www.math.leidenuniv.nl/~avdvaart/ASC/EM.pdf

https://math.usask.ca/~longhai/teaching/stat812-1409/rdemo/EM_examples.pdf

Nov 20th,2017

http://www.cnblogs.com/jerrylead/archive/2011/04/06/2006936.html Covers some EM. Largely copied from textbooks, but very thorough, unlike some people who just brag.

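The bias-variance decomposition shows that generalization performance is jointly determined by the capability of the learning algorithm, the sufficiency of the data, and the inherent difficulty of the learning task.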

https://www.codecogs.com/latex/eqneditor.php Use this for editing formulas online.

Nov 21st, 2017

https://www.projectrhea.org/rhea/index.php/MLE_Examples:_Binomial_and_Poisson_Distributions_OldKiwi Introduces MLE; the differentiation contains errors.

https://stats.stackexchange.com/questions/181035/how-to-derive-the-likelihood-function-for-binomial-distribution-for-parameter-es Introduces MLE.

https://tech.meituan.com/deep-understanding-of-ffm-principles-and-practices.html The Meituan tech team's assessment of FFM.

https://stats.stackexchange.com/questions/181035/how-to-derive-the-likelihood-function-for-binomial-distribution-for-parameter-es Derives the binomial log likelihood.
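As a numerical sanity check on the binomial log likelihood, the sketch below evaluates it on a grid and compares the maximizer with the closed-form estimate k / n obtained by setting the derivative k / p - (n - k) / (1 - p) to zero. The counts `k` and `n` and the grid resolution are made-up values for illustration.

```python
import math

def binomial_log_likelihood(p, k, n):
    """log L(p) = log C(n, k) + k log p + (n - k) log(1 - p)."""
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1 - p)

k, n = 7, 20                      # hypothetical: 7 successes in 20 trials
p_hat = k / n                     # closed-form maximum likelihood estimate

# Brute-force search over p in (0, 1) as an independent check.
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=lambda p: binomial_log_likelihood(p, k, n))
```

The grid maximizer agrees with k / n, which is a quick way to catch sign or algebra slips in a hand derivation.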

https://stackoverflow.com/questions/8936099/returning-multiple-objects-in-an-r-function How to return multiple objects from an R function; the idea of using S4 is quite nice.

Nov 23rd,2017

https://github.com/ShifuML/shifu This tool, which runs on Hadoop YARN, looks impressive.

https://wenku.baidu.com/view/49e82037e518964bcf847c81.html An introduction to some discrete distributions.

http://www.jianshu.com/p/38c13be59137 The beta-binomial distribution; notes on distributions for machine learning.

https://max.book118.com/html/2017/0314/95370569.shtm The Kumaraswamy binomial distribution; like the beta-binomial distribution, it addresses the case where events are dependent.

Nov 24th,2017

https://github.com/grigio/vim-sublime The vim and Sublime editors.

https://www.pyimagesearch.com/2016/10/17/stochastic-gradient-descent-sgd-with-python/ A Python implementation of SGD.

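"Machine Learning" mentions five optimization methods but does not say how to choose among them. Could it be that people at the time simply picked for themselves?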

Nov 27th,2017

https://www.quora.com/Whats-the-difference-between-gradient-descent-and-stochastic-gradient-descent Introduces SGD; pretty good.

https://github.com/airoldilab/sgd Dustin Tran again. This guy is amazing, and so is airoldilab.

http://applied.stat.harvard.edu/ airoldilab, where the gods gather.

https://github.com/edwardlib/observations I have not looked at Edward for a few months, and all I can say is "wow".

https://www.zhihu.com/question/19894595 Various discussions of MLE, MAP, and EM.

https://cosx.org/2011/01/how-does-glm-generalize-lm-fit-and-test From linear models to generalized linear models!

https://wenku.baidu.com/view/7e25200d360cba1aa911da28.html Application of GAM models to auto insurance premium pricing, with a rating-class example.

http://chuansong.me/n/1783297842040 SGD as a method for Bayesian posterior inference.

https://www2.stat.duke.edu/courses/Fall00/sta216/handouts/diagnostics.pdf Duke University's explanation of Fisher scoring; not what I need. Downloaded!

https://www.researchgate.net/publication/266599562_M_estimation_S_estimation_and_MM_estimation_in_robust_regression M estimation, S estimation, and MM estimation in robust regression. Downloaded.

https://www.stat.berkeley.edu/~bartlett/courses/2013spring-stat210b/notes/15notes.pdf M_estimation and Z_estimation; somewhat hard.

https://socialsciences.mcmaster.ca/magee/761_762/other%20material/M-estimation.pdf Fairly theoretical; not downloaded.

http://www4.ncsu.edu/~jack/robust_est6.pdf Applications of random matrix theory. Downloaded.

http://www4.stat.ncsu.edu/~boos/papers/mest6.pdf Downloaded.

http://dept.stat.lsa.umich.edu/~moulib/emp-proc-notes-2.pdf These notes are too hard. Downloaded.

http://data.princeton.edu/wws509/notes This one is good.

http://www.math.umd.edu/~bnk/Ch_GLMsld.pdf GLM analysis of time series. Downloaded.

http://www2.sas.com/proceedings/sugi27/p258-27.pdf An excellent paper on survey analysis.

Scalable estimation strategies based on stochastic approximations: Classical results and new insights. Very complete, but I cannot follow it. Downloaded.

http://www.stat.umn.edu/geyer/5931/mle/glm.pdf Excellent.

https://stats.stackexchange.com/questions/176351/implement-fisher-scoring-for-linear-regression Has R code worth studying; it involves inverting the Hessian matrix!
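To make the Fisher scoring update concrete, here is a minimal Python sketch. It deliberately swaps in logistic regression with an intercept and one feature (the linked question concerns linear regression and uses R), so that the 2x2 information matrix can be inverted by hand. The function name and the toy data are assumptions. With the canonical logit link, the expected information equals the negative Hessian, so this coincides with Newton-Raphson.

```python
import math

def fisher_scoring_logistic(xs, ys, n_iter=25, tol=1e-10):
    """Fit P(y=1 | x) = sigmoid(b0 + b1 * x) by Fisher scoring."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        u0 = u1 = 0.0              # score vector U = X^T (y - mu)
        i00 = i01 = i11 = 0.0      # Fisher information I = X^T W X
        for x, y in zip(xs, ys):
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = mu * (1.0 - mu)
            u0 += y - mu
            u1 += (y - mu) * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        # Solve I * delta = U using the closed-form inverse of a 2x2 matrix.
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (-i01 * u0 + i00 * u1) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Toy data with overlapping classes, so the maximum likelihood estimate is finite.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 1, 0, 1, 1]
b0, b1 = fisher_scoring_logistic(xs, ys)
```

At convergence the score vector is numerically zero, which is the first-order condition for the maximum likelihood estimate. With more than a couple of parameters, solving the linear system with a numerical library is preferable to an explicit inverse.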

https://stats.stackexchange.com/questions/175882/why-fisher-scoring-is-easier-to-compute#comment333067_175882 A supplement to the above.

https://userpages.umbc.edu/~gobbert/papers/RaimLiuNeerchalMorel2012.pdf Downloaded!

https://gist.github.com/jtrive84/3517fd79f5959574b49f88cac3bf61ca fisher scoring R script

http://galton.uchicago.edu/~eichler/stat24600/Handouts/l02.pdf Theory; worth a careful read!

http://www.stats.uwo.ca/faculty/bellhouse/Likelihood_Theory_with_Score_Function.pdf Downloaded. Excellent, with an R script!

http://www.dbs.ifi.lmu.de/%7Etresp/papers/final_nips_fisher.pdf

https://nlp.stanford.edu/manning/courses/ling289/logistic.pdf

http://www.stats.ox.ac.uk/~steffen/teaching/bs2HT9/scoring.pdf Downloaded.

http://leon.bottou.org/publications/pdf/compstat-2010.pdf Theory.

https://datascience.stackexchange.com/questions/664/fisher-scoring-v-s-coordinate-descent-for-mle-in-r Compares Fisher scoring and coordinate descent.

Nov 28th,2017

http://logos.name/archives/187 Touches on gradient methods and Newton's method.

http://deeplearning.net/tutorial/logreg.html Solves MNIST with logistic regression; impressive.

https://journal.r-project.org/archive/2009-2/RJournal_2009-2_Damico.pdf Comparison of R survey and SAS STRATA...

http://xueshu.baidu.com/s?wd=paperuri%3A%28419032e9dfa8e822500a0b5d8d320a7d%29&filter=sc_long_sign&tn=SE_xueshusource_2kduw22v&sc_vurl=http%3A%2F%2Fwww.doc88.com%2Fp-1425469281665.html&ie=utf-8&sc_us=2430949782615356742 On hierarchical (stratified) GLMs.

http://r-survey.r-forge.r-project.org/survey/index.html This site has a series of materials on complex sampling.

http://www.onedigit.co.uk/content/glm-generalised-linear-model-fisher-information-matrix !

http://statmath.wu.ac.at/courses/heather_turner/glmCourse_001.pdf !

https://baike.baidu.com/item/%E5%AF%B9%E6%95%B0%E5%87%BD%E6%95%B0 Properties of logarithms.

Nov 29th,2017

https://onlinecourses.science.psu.edu/stat504/node/176 The Proportional-Odds Cumulative Logit Model

https://wenku.baidu.com/view/a914cfb6f705cc175427090b.html Sample surveys: ratio estimation.

Dec 3rd,2017

http://www.stat.ufl.edu/~aa/ordinal/R_examples.pdf

https://rpubs.com/trjohns/survey-ratioreg

https://onlinecourses.science.psu.edu/stat504/node/177
