imputation

A consideration of the use of XGBoost as a replacement technique for imputation of missing values in official statistics.

Background

This work was originally undertaken as part of a Data Science Academy project. The Academy is a scheme internal to the UK's Office for National Statistics that allows staff to undertake short (two-week) projects applying machine learning techniques to domains the mentee already knows, supervised by members of @datasciencecampus.

This project was created for @Vinayak-NZ.

Contents

Techniques compared include:

  • XGBoost
  • CANCEIS
  • RBEIS
  • Mixed methods
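The general pattern behind model-based imputation can be sketched as follows: fit a predictor on the records where the target variable is observed, then predict the target for the records where it is missing. This is only a minimal illustration, not the code used in this project. The names `impute` and `GroupMeanModel` and the toy records are hypothetical, and the group-mean predictor is a deliberately simple stand-in for XGBoost. A model exposing the same `fit`/`predict` methods, such as `XGBRegressor` from the xgboost package, could in principle be passed instead once the predictors are numerically encoded.

```python
from statistics import mean


class GroupMeanModel:
    """Hypothetical stand-in predictor with a fit/predict interface (NOT XGBoost)."""

    def fit(self, X, y):
        # Average the observed target within each distinct combination of predictors.
        groups = {}
        for features, value in zip(X, y):
            groups.setdefault(tuple(features), []).append(value)
        self.group_means_ = {key: mean(values) for key, values in groups.items()}
        self.overall_mean_ = mean(y)
        return self

    def predict(self, X):
        # Fall back to the overall mean for combinations not seen during fit.
        return [self.group_means_.get(tuple(f), self.overall_mean_) for f in X]


def impute(records, target, predictors, model):
    """Return copies of the records with missing (None) target values predicted."""
    observed = [r for r in records if r[target] is not None]
    model.fit(
        [[r[p] for p in predictors] for r in observed],
        [r[target] for r in observed],
    )
    completed = []
    for record in records:
        record = dict(record)  # copy, so the input records are left untouched
        if record[target] is None:
            features = [record[p] for p in predictors]
            record[target] = model.predict([features])[0]
        completed.append(record)
    return completed


# Toy data, invented purely for illustration.
records = [
    {"region": "A", "employed": 1, "income": 30.0},
    {"region": "A", "employed": 1, "income": 34.0},
    {"region": "B", "employed": 0, "income": 12.0},
    {"region": "A", "employed": 1, "income": None},
    {"region": "C", "employed": 0, "income": None},
]
completed = impute(records, "income", ["region", "employed"], GroupMeanModel())
```

Keeping the model as a parameter means the same harness can be reused to compare different predictors on identical records, which is the kind of comparison this repository is concerned with.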

Further work is ongoing outside this repository on other approaches, including multiple imputation and more sophisticated techniques (e.g. autoencoders).

There is also a separate workstream considering the use of genetic algorithms for this type of work. This is an Academy project for a different member of the imputation methodology team.

Future plans include a methodological assessment of how suitable these techniques are in practice, and a more abstract consideration of which mechanisms would be most suitable for this kind of imputation.

The project write-up is available on the gh-pages branch, and the presentation given after completion of the work is on the presentation branch.

Contributors

  • vinayak-nz