s6soverd / computational-statistics-project-onlasso

Final Project from the course - Computational Statistics (Summer Term, 2020), University of Bonn

computational-statistics-project-onlasso's Introduction

Computational Statistics Course Project:

On the shrinkage method LASSO: does shrinking OLS parameters improve prediction error?

Exploring LASSO's defining property of setting parameters exactly to zero in various simulation settings

Downloading and Viewing this project:

The best way to view this project is through the nbviewer or mybinder badges above, because GitHub cannot render certain Markdown syntax in Jupyter notebooks. Or, even better, download the notebook from this GitHub repository.

Briefly on the project:

This paper focuses on the theoretical properties of LASSO, an ℓ1-norm regularization method. Drawing on these properties, I set up simulations that show, visually or with tables, how LASSO behaves in specific settings and how it differs from OLS and Ridge, which are also linear regression methods. The structure of the paper is outlined in the "Table of Contents" below. The first section briefly covers the well-known bias-variance tradeoff in machine learning, which motivates the choice of LASSO for this project. The second section explains what a high-dimensional setting is, why OLS fails to estimate parameters in such settings, and why LASSO helps there. The third section points out that LASSO has no closed-form solution in general, although an explicit LASSO estimate can be derived in a few simplified cases. Its subsection 3.1 shows by simulation what happens when OLS parameters are shrunk by very small random values, addressing questions such as "Do we get better predictions?" and "Does the variance of such models improve?"
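To make the high-dimensional problem mentioned above concrete, here is a minimal sketch in plain Python. It is not taken from the notebook: the toy data and the helper names `predict`, `rss`, and `l1_norm` are assumptions chosen for illustration. With more covariates than observations (p = 3, n = 2), several coefficient vectors fit the data perfectly, so least squares alone cannot pick one. An ℓ1 penalty is one way to rank such candidates.

```python
# Toy example (hypothetical data): p = 3 covariates, n = 2 observations.

def predict(X, beta):
    """Return the fitted values X @ beta for a list-of-rows matrix X."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]

def rss(X, y, beta):
    """Residual sum of squares of the linear fit."""
    return sum((y_i - f_i) ** 2 for y_i, f_i in zip(y, predict(X, beta)))

def l1_norm(beta):
    """Sum of absolute coefficient values, the quantity LASSO penalizes."""
    return sum(abs(b) for b in beta)

X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [2.0, 3.0]

# Three different coefficient vectors that all reproduce y exactly.
beta_a = [2.0, 3.0, 0.0]
beta_b = [0.0, 1.0, 2.0]
beta_c = [1.0, 2.0, 1.0]

for name, beta in [("a", beta_a), ("b", beta_b), ("c", beta_c)]:
    print(name, "RSS =", rss(X, y, beta), "l1 norm =", l1_norm(beta))
```

All three candidates have zero residual sum of squares, so the OLS criterion is indifferent between them. Their ℓ1 norms differ, which is the extra information a LASSO-type penalty uses to choose among them.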

Section 4 briefly discusses the main differences and commonalities between Ridge and LASSO, ahead of the simulation set-ups that also use Ridge as a benchmark. In subsection 4.1, a simple linear regression model is used to show visually why the Ridge estimate only approaches zero asymptotically, whereas LASSO can set the parameter in question exactly to zero. Section 5 shows which of the linear regression methods is better, on average, at recovering the zero entries when some entries of the true β vector are zero.
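The contrast between Ridge and LASSO in a simple linear regression can be sketched with the one-covariate closed forms. This is an illustrative sketch, not the notebook's code. It assumes a model without intercept, the objective (1/2)·RSS + λ|β| for LASSO and (1/2)·RSS + (λ/2)·β² for Ridge, and hypothetical toy data. Other scalings of the penalty change the numbers but not the qualitative picture.

```python
def soft_threshold(z, lam):
    """Return sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def ridge_slope(x, y, lam):
    """Minimizer of 0.5 * sum((y - b*x)^2) + 0.5 * lam * b^2."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)

def lasso_slope(x, y, lam):
    """Minimizer of 0.5 * sum((y - b*x)^2) + lam * |b|."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return soft_threshold(sxy, lam) / sxx

# Hypothetical toy data, roughly y = x.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.8]

for lam in [0.0, 1.0, 10.0, 30.0, 100.0]:
    print(lam, ridge_slope(x, y, lam), lasso_slope(x, y, lam))
```

At λ = 0 both estimates coincide with OLS. As λ grows, the Ridge slope is divided by an ever larger denominator and therefore shrinks toward zero without reaching it for any finite λ, while the LASSO slope becomes exactly zero once λ is at least |Σ xᵢyᵢ|.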

The rest of this project paper consists of two simulations that bring out the differences between Ridge and LASSO. They address questions such as "How do LASSO and Ridge treat the parameters of covariates with different variances?" and "What about the parameters of correlated covariates?"
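The question about correlated covariates can be illustrated with the extreme case of two identical columns. The sketch below is not the notebook's simulation. It is a small coordinate-descent routine in plain Python, with the hypothetical name `coordinate_descent`, toy data, and the assumed objectives (1/2)·RSS + λ·Σ|βⱼ| for LASSO and (1/2)·RSS + (λ/2)·Σβⱼ² for Ridge.

```python
def soft_threshold(z, lam):
    """Return sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def coordinate_descent(X, y, lam, penalty, sweeps=200):
    """Cyclic coordinate descent for a lasso- or ridge-penalized fit.

    X is a list of rows, y a list of responses; no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p  # start from the all-zero vector
    for _ in range(sweeps):
        for j in range(p):
            # Partial residual that leaves out covariate j.
            r = [y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                 for i in range(n)]
            xr = sum(X[i][j] * r[i] for i in range(n))
            xx = sum(X[i][j] ** 2 for i in range(n))
            if penalty == "lasso":
                beta[j] = soft_threshold(xr, lam) / xx
            elif penalty == "ridge":
                beta[j] = xr / (xx + lam)
            else:
                raise ValueError("penalty must be 'lasso' or 'ridge'")
    return beta

# Hypothetical data: two identical covariates, y = 2 * x without noise.
x = [1.0, -1.0, 1.0, -1.0]
X = [[v, v] for v in x]
y = [2.0 * v for v in x]

lasso_beta = coordinate_descent(X, y, lam=2.0, penalty="lasso")
ridge_beta = coordinate_descent(X, y, lam=2.0, penalty="ridge")
print("lasso:", lasso_beta)
print("ridge:", ridge_beta)
```

In this set-up Ridge splits the effect equally between the two copies, while the LASSO fit puts all of it on one covariate and sets the other to zero. With perfectly identical columns the LASSO solution is not unique: any same-sign split with the same coefficient sum gives the same objective value, and which one is returned depends on the algorithm and its starting point. Here, coordinate descent started from zero lands on the sparse one.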

References:

  1. Gauraha, N. (2018). Introduction to the LASSO. Resonance, 23, 439–464.

  2. Wieringen, W.N. (2015). Lecture notes on ridge regression. arXiv: Methodology.

  3. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani. (2013). An introduction to statistical learning: with applications in R. New York: Springer.

  4. Belloni, Alexandre, Victor Chernozhukov, and Christian Hansen. (2014). "High-Dimensional Methods and Inference on Structural and Treatment Effects." Journal of Economic Perspectives, 28 (2): 29-50.
