ocramz / record-encode

Generic encoding of record types

License: BSD 3-Clause "New" or "Revised" License

Haskell 100.00%
data-science one-hot-encode categorical-data categorical-features data-mining data-analysis generic-programming preprocessing machine-learning

record-encode

Encoding categorical variables

This library provides generic machinery to encode values of some algebraic type as points in a vector space.

Values of a sum type (e.g. enumerations) are also called "categorical" variables in statistics, because they encode a choice between a number of discrete categories.

Many data science and machine learning algorithms, on the other hand, require a purely numerical representation of data. The code that converts values of a static type into numbers is often "boilerplate", i.e. largely repetitive and uninformative.

The encodeOneHot function provided here is a generic utility function (i.e. defined once and for all) to compute the one-hot representation of any sum type.
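
To make "defined once and for all" concrete, here is a minimal sketch of how a constructor index and a constructor count can be derived generically for any enumeration-like sum type. It uses only GHC.Generics from base rather than generics-sop, and the names GSum, gSize, gIndex, oneHotGeneric and Colour are hypothetical, chosen for this illustration. It is not how record-encode itself is implemented.

```haskell
{-# LANGUAGE DeriveGeneric, FlexibleContexts, FlexibleInstances,
             ScopedTypeVariables, TypeOperators #-}

import Data.Proxy (Proxy (..))
import GHC.Generics

-- Hypothetical class (not part of record-encode): walks the generic
-- representation of a sum type.
class GSum f where
  gSize  :: Proxy f -> Int  -- number of constructors
  gIndex :: f p -> Int      -- zero-based position of the constructor used

-- Datatype metadata wrapper: look through it.
instance GSum f => GSum (M1 D c f) where
  gSize _       = gSize (Proxy :: Proxy f)
  gIndex (M1 x) = gIndex x

-- A single constructor counts as one category; its fields are ignored.
instance GSum (M1 C c f) where
  gSize _  = 1
  gIndex _ = 0

-- A choice between two groups of constructors.
instance (GSum f, GSum g) => GSum (f :+: g) where
  gSize _       = gSize (Proxy :: Proxy f) + gSize (Proxy :: Proxy g)
  gIndex (L1 x) = gIndex x
  gIndex (R1 y) = gSize (Proxy :: Proxy f) + gIndex y

-- Returns (number of categories, index of the active category).
oneHotGeneric :: forall a. (Generic a, GSum (Rep a)) => a -> (Int, Int)
oneHotGeneric x = (gSize (Proxy :: Proxy (Rep a)), gIndex (from x))

-- Example type, assumed for illustration only.
data Colour = Red | Green | Blue deriving (Show, Generic)
```

With these definitions, oneHotGeneric Green evaluates to (3, 1). No per-type encoder is written by hand: adding a constructor to Colour changes the result without touching the encoding code.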

Usage example

    {-# language DeriveGeneric #-}

    import qualified GHC.Generics as G
    import qualified Generics.SOP as SOP

    import Data.Record.Encode

    -- A simple enumeration with three constructors
    data X = A | B | C deriving (G.Generic)
    instance SOP.Generic X

    -- In GHCi:
    > encodeOneHot B
    OH {oDim = 3, oIx = 1}
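
The printed result reads as "dimension 3, active index 1", i.e. the second of three categories, counting from zero. As a rough sketch of what that means as a point in a vector space, the snippet below expands such a pair into a dense vector. The OneHot record and toDense function are hypothetical stand-ins written for this illustration; they are not the library's own type or API.

```haskell
-- Hypothetical stand-in for the value printed above; the field names are
-- illustrative and this is not the library's own type.
data OneHot = OneHot
  { ohDim :: Int  -- number of categories
  , ohIx  :: Int  -- zero-based index of the active category
  } deriving (Show, Eq)

-- Expand to a dense vector: 1 at the active index, 0 everywhere else.
toDense :: OneHot -> [Double]
toDense (OneHot d i) = [ if j == i then 1 else 0 | j <- [0 .. d - 1] ]
```

Under these assumptions, toDense (OneHot 3 1) yields [0.0,1.0,0.0]. Storing only the dimension and the index keeps the encoding compact, and the dense form can be produced on demand.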

Please refer to the documentation of Data.Record.Encode for more examples and details.

Acknowledgements

Thanks to Gagandeep Bhatia (@gagandeepb) for his Google Summer of Code 2018 work on Frames-beam, Mark Karpov (@mrkkrp) for his Template Haskell tutorial, Anthony Cowley (@acowley) for Frames, and @mniip on Freenode #haskell for helping me better understand what can be done with generic programming.
