buxton-challenge-2014

These R files contain the statistical models I built for a predictive analytics challenge in November 2014, sponsored by Buxton and the Wharton Customer Analytics Initiative. The challenge was to predict the sales revenue of potential new store locations for a fast food chain, given a data set with sales and over 700 features for 300 existing locations. Notably, this is a "p > N" setting: with more features than observations, ordinary least squares has no unique solution, so standard regression models break down. I used statistical learning methods to handle this issue robustly while also satisfying the model interpretability requirements in the competition rules. My team and I presented our methods and recommendations to a "non-technical" panel of judges who role-played the chain's executive leadership.
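To make the "p > N" problem concrete, here is a minimal sketch in base R on simulated data (not the challenge data, which is unavailable). The sizes n = 10 and p = 20 are arbitrary choices for illustration. Ordinary least squares cannot identify all of the coefficients and fits pure noise perfectly, which is why some form of regularization is needed.

```r
# Illustration on simulated data: ordinary least squares when p > N.
set.seed(0)
n <- 10   # observations (assumed, small for readability)
p <- 20   # features, deliberately more than n
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)   # response is pure noise, unrelated to x

fit <- lm(y ~ x)

# lm() can estimate at most n of the p + 1 coefficients (intercept included);
# the ones it cannot identify are reported as NA.
n_undetermined <- sum(is.na(coef(fit)))

# The identified coefficients still interpolate the noise exactly,
# so the in-sample residuals are numerically zero.
max_abs_resid <- max(abs(resid(fit)))
```

A perfect in-sample fit to noise says nothing about new locations, so the in-sample error is useless as a guide to predictive accuracy here.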

Description of Files

ElasticNet.R fits an Elastic Net (R package glmnet), which performs shrinkage and variable selection via combined L1 and L2 regularization to produce a multiple regression model. I also wrote code to clean the data, since the data set had missing and mislabeled values in some places, and implemented cross-validation to optimize predictive accuracy.
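The following is a rough sketch of a cross-validated Elastic Net fit with glmnet, not the code in ElasticNet.R. The data are simulated with the same shape as the challenge data (300 rows, 700 features), and the mixing value alpha = 0.5, the fold count, and all variable names are assumptions made for illustration.

```r
# Hypothetical illustration on simulated data; not the original ElasticNet.R.
library(glmnet)

set.seed(1)
n <- 300
p <- 700
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(2, 5), rep(0, p - 5))   # only the first 5 features matter
y <- drop(x %*% beta + rnorm(n))

# alpha mixes the penalties: 1 is the LASSO (L1), 0 is ridge (L2).
# alpha = 0.5 is an assumed value; 10-fold cross-validation picks lambda.
cv_fit <- cv.glmnet(x, y, alpha = 0.5, nfolds = 10)

# Coefficients at the lambda with the lowest cross-validated error.
# coef() returns a sparse one-column matrix with the intercept in row 1.
b <- as.matrix(coef(cv_fit, s = "lambda.min"))
selected <- which(b[-1, 1] != 0)   # indices of features kept by the model

# Predicted response for the first three rows
pred <- predict(cv_fit, newx = x[1:3, , drop = FALSE], s = "lambda.min")
```

The features listed in `selected` form a much smaller set than the full 700, which is what keeps the resulting regression model interpretable.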

LocalLASSO.R implements a semiparametric model that fits a weighted multiple regression (LASSO) for each existing store, with the weights given by a composite kernel capturing demographic similarities between pairs of stores. (The intuition is that the set of variables most important for revenue may depend on a store location's demographics.) The sales revenue prediction for each potential store is a local average of the predictions from the models associated with the k most similar existing stores, again using the kernel as the similarity metric.
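One way to sketch this idea with glmnet is shown below. It is not the code in LocalLASSO.R, and several details are assumptions: the kernel is taken to be exp(-distance) over simulated two-dimensional "demographics", the penalty lambda is fixed rather than tuned, the local average is weighted by the kernel, and the function name `predict_local` is hypothetical.

```r
# Hypothetical sketch of a kernel-weighted local LASSO on simulated data.
library(glmnet)

set.seed(2)
n <- 60
p <- 100
x <- matrix(rnorm(n * p), n, p)        # store features
y <- drop(2 * x[, 1] + rnorm(n))       # store revenue (simulated)

# Stand-in for demographic similarity: the kernel form exp(-distance)
# is an assumption, not the composite kernel used in the original work.
demo <- matrix(rnorm(n * 2), n, 2)
K <- exp(-as.matrix(dist(demo)))       # n x n similarities between stores

predict_local <- function(x_new, k_new, x, y, K, k = 5, lambda = 0.1) {
  # The k existing stores most similar to the new location
  nearest <- order(k_new, decreasing = TRUE)[seq_len(k)]
  preds <- vapply(nearest, function(i) {
    # LASSO centered on store i: observations weighted by similarity to i
    fit <- glmnet(x, y, weights = K[i, ], alpha = 1)
    as.numeric(predict(fit, newx = matrix(x_new, nrow = 1), s = lambda))
  }, numeric(1))
  # Local average of the k predictions, weighted by similarity (assumed)
  weighted.mean(preds, w = k_new[nearest])
}

# A simulated potential location and its similarity to each existing store
x_new <- rnorm(p)
demo_new <- rnorm(2)
k_new <- exp(-sqrt(colSums((t(demo) - demo_new)^2)))
pred <- predict_local(x_new, k_new, x, y, K)
```

Because each local model sees the same rows with different weights, the variables it selects can differ from store to store, which matches the intuition given above.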

ComputeBhatt.R outputs five .csv files, each containing a matrix of the Bhattacharyya distances between pairs of stores for one of the five demographic categories (Age, Income, Household Size, Family Size, and Education). These are combined to form the composite kernel in LocalLASSO.R.
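For two discrete distributions p and q over the same brackets, the Bhattacharyya distance is D(p, q) = -ln(sum_i sqrt(p_i * q_i)). The sketch below computes the pairwise matrix for one category in base R. The function name `bhatt_matrix`, the three example stores, and their age-bracket proportions are made up for illustration, and this is not the code in ComputeBhatt.R.

```r
# Hypothetical sketch: pairwise Bhattacharyya distances for one category.
# P has one row per store and one column per bracket; each row sums to 1.
bhatt_matrix <- function(P) {
  S <- sqrt(P)
  bc <- tcrossprod(S)   # Bhattacharyya coefficients: sum_i sqrt(p_i * q_i)
  bc <- pmin(bc, 1)     # guard against rounding slightly above 1
  -log(bc)              # distance; Inf when two stores share no bracket
}

# Made-up age-bracket proportions for three stores
age <- rbind(
  store1 = c(0.2, 0.5, 0.3),
  store2 = c(0.2, 0.5, 0.3),
  store3 = c(0.6, 0.3, 0.1)
)
D <- bhatt_matrix(age)

# One such matrix per category could then be written out, for example:
# write.csv(D, "bhatt_age.csv")   # file name is an assumption
```

Stores with identical distributions, such as store1 and store2 here, are at distance zero, and the distance grows as the distributions overlap less.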

Due to a nondisclosure agreement, the actual data set is unavailable. My team's presentation and write-up are available by request; please email [email protected].
