
hype's People

Contributors

cgravill, dsyme, gbaydin, gitter-badger, smoothdeveloper


hype's Issues

Usage example of RNN with large inputs

Hi,

I would like to use large inputs for training a phrase-generating model. I tried to adapt the example on the website, but of course I get an OutOfMemoryException.

What would be a good approach to this kind of task?

Thanks,
Adrian
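
A possible direction (a sketch, not a verified fix; the Batch field and Minibatch case are taken from Hype's published docs, so check them against your version of the library) is to train on minibatches so that only a slice of the data is resident per optimization step, and to split long sequences into shorter chunks (truncated backpropagation through time):

// Sketch only: minibatch training keeps a small slice of the dataset
// live per step instead of the whole thing. The field and case names
// are assumed from Hype's docs; verify against your library version.
let par = {Params.Default with Batch = Minibatch 10}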

How are gradients for weights and biases calculated for individual layers?

Hi. I am lost after the following part.

let func = fun w x ->
    l.Decode w
    l.Run x
let w0 = l.Encode()
let w, loss, _, lhist = Optimize.Train(func, w0, dataset, valid, par)

We send func to be optimized w.r.t. the dataset, but the Optimize.Train method has no idea about the individual layers of the FeedForward network.

In the Optimize module, I guess the whole network is flattened into a single function of a flat weight vector, as if it were one big logistic regression, hence the vector form of the weights (though I may be wrong). Finally, where do the biases get updated?
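
As I understand it, the pattern is roughly the following (a sketch of the flatten/unflatten idea, not Hype's actual implementation): Encode concatenates every layer's weight matrix and bias vector into one flat DV, Decode slices that DV back into the layers, and the optimizer only ever sees the flat vector, so the biases get updated implicitly whenever Decode writes the slices back.

// Sketch only, not Hype's code: flatten each layer's weights and bias
// into a single DV. Decode would do the inverse, slicing the flat
// vector back into per-layer matrices and bias vectors.
let encode (weights:DM list) (biases:DV list) : DV =
    List.map2 (fun (w:DM) (b:DV) -> DV.Append(DM.toDV w, b)) weights biases
    |> List.reduce (fun a b -> DV.Append(a, b))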

Note: I am sorry to ask these questions on the issues board, but the only person who can answer them is the person who wrote the library. I have been trying to implement the Contrastive Divergence algorithm, but I am stuck on this particular part of Hype.

Untraceable IndexOutOfRangeException.

Here is my simple autoencoder code:

type Activation =
    | Sigm
    | Softmax
    | Linear
    | Tanh

    member this.funDM =
        match this with
        | Sigm -> DM.Sigmoid
        | Softmax -> DM.mapCols DV.SoftMax
        | Tanh -> DM.Tanh
        | Linear -> id

    member this.funDV =
        match this with
        | Sigm -> DV.Sigmoid
        | Softmax -> DV.SoftMax
        | Tanh -> DV.Tanh
        | Linear -> id

let inline ( *+) dv f = DV.Append(dv, toDV [|f|])

/// i -> number of inputs  h -> number of hidden units
type AutoEncoder(i,h,a:Activation) =

    // Helper functions
    let removeBias (X:DM) = X.[0..X.Rows-2, *]
    let replaceBias (X:DM) = X.[0..X.Rows-2, *] |> DM.appendRow (DV.create X.Cols 1)
    let appendBiasDM (X:DM) = DM.appendRow (DV.create X.Cols 1) X

    /// Weights W:DM. Tied weights: both layers use the same weight matrix, one the transpose of the other.
    member val W = Rnd.NormalDM(h+1, i+1, D 0.f, D 0.1f) with get, set

    /// Flattened weights W':DV
    member this.W' with get() = DM.toDV this.W and set W' = this.W <- DV.toDM (h+1) W'

    /// Forward propagate the data
    member this.RunDM (X:DM) =
        let h = X |> appendBiasDM |> (*) this.W |> replaceBias
        (DM.Transpose this.W) * h |> removeBias

    /// Forward propagation when W' provided by optimization algorithm
    member this.Run (W':DV) (X:DM) =
        this.W' <- W'
        this.RunDM X

    /// Encode data and get hidden unit
    member this.Encode (X:DM) = X |> appendBiasDM |> (*) this.W |> replaceBias

// TEST
let ae = AutoEncoder(3,2, Activation.Sigm)
let p = {Params.Default with Regularization = NoReg; Loss = Loss.Quadratic }
let X' = ((toDM [[1.f;5.f;2.f];[8.f;2.f;2.f];[1.f;5.f;2.f];[8.f;2.f;2.f];
                [1.1f;5.2f;2.f];[8.1f;2.1f;2.f];[0.9f;4.9f;2.f];[7.9f;1.9f;2.f]])) / 10 |> DM.Transpose

let ds' = Dataset(X', X'.Copy())
let a,b,_,_ = Optimize.Train(ae.Run, ae.W', ds', p)

This code produces the following error.

System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at DiffSharp.AD.Float32.tupledArg_2@2296-39(Int32 j, Int32 i, DM b, Single[,] a, Single[,] bb)
   at DiffSharp.AD.Float32.DM.AddSubMatrix(DM a, Int32 i, Int32 j, DM b)
   at DiffSharp.AD.Float32.DOps.pushRec@2968(FSharpList`1 ds)
   at Hype.Optimize.Train(FSharpFunc`2 f, DV w0, Dataset d, Dataset v, Params par)
   at Hype.Optimize.Train(FSharpFunc`2 f, DV w0, Dataset d, Params par)

I also tried to debug with the source code of Hype and DiffSharp, but couldn't figure out where things go wrong.

Array2D.copy is slow, you can use array.Clone

The performance of the recurrent neural networks sample is dominated by Array2D.copy, called as part of DM addition inside pushRec in DiffSharp.

  1. It might be possible to use in-place addition. I'm not entirely sure which addition operation in pushRec is dominating, but in cases like this it looks like in-place addition might be appropriate (see the sketch after the timings below):

         d.A <- d.A + (v :?> DM)

  2. Regardless, you're using Array2D.copy in DiffSharp, and that seems slower than it should be since it is implemented as an init-based copy. That should be fixed in FSharp.Core, but in the meantime you can use the following, which seems about 4x faster:

module Array2D =  
    let copyFast (array : 'T[,]) =  array.Clone() :?> 'T[,]

e.g.

#time "on"

let test1() = 
    let mutable res  = Array2D.zeroCreate<float32> 100 100 
    for i in 0 .. 1000 do   
        for j in 0 .. 100 do
            res <- Array2D.copy res

let test2() = 
    let mutable res  = Array2D.zeroCreate<float32> 100 100 
    for i in 0 .. 1000 do   
        for j in 0 .. 100 do
            res <- Array2D.copyFast res

test1()  // 4.4s
test2() // 0.98s
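
For reference, here is what raw in-place accumulation could look like at the Array2D level (a hypothetical helper for illustration, not part of DiffSharp or FSharp.Core):

module Array2D =
    // Hypothetical sketch: add src into target element by element,
    // mutating target instead of allocating a fresh array per addition.
    let addInPlace (target: float32[,]) (src: float32[,]) =
        for i in 0 .. Array2D.length1 target - 1 do
            for j in 0 .. Array2D.length2 target - 1 do
                target.[i, j] <- target.[i, j] + src.[i, j]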

Bias terms are also regularized.

match r with
| L1Reg l -> fun (w:DV) -> l * (DV.l1norm w)
| L2Reg l -> fun w -> l * (DV.l2normSq w)
| NoReg -> fun _ -> D 0.f

This regularization method penalizes the bias terms as well; bias terms should be excluded from the penalty.
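
One possible fix, sketched under the assumption that the caller can supply a mask marking which entries of the flat weight vector are biases (the mask is hypothetical, not part of Hype's API):

// Sketch only: biasMask is a hypothetical DV holding 0.0f at bias
// positions and 1.0f elsewhere, so bias entries drop out of the norm.
let l2RegExcludingBias (l:D) (biasMask:DV) =
    fun (w:DV) -> l * DV.l2normSq (w .* biasMask)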

Move // comments to be ///

The library is light on comments (though there are some! And the code is readable).

One easy way to improve things is to move // comments like these

| Constant    of D         // Constant
| Decay       of D * D     // 1 / t decay, a = a0 / (1 + kt). Initial value, decay rate
| ExpDecay    of D * D     // Exponential decay, a = a0 * Exp(-kt). Initial value, decay rate

to be /// comments like these

/// Constant learning rate
| Constant    of D         

/// Learning rate of 1 / t decay, a = a0 / (1 + kt). Initial value, decay rate
| Decay       of D * D    

/// Exponential decay learning rate, a = a0 * Exp(-kt). Initial value, decay rate
| ExpDecay    of D * D  

etc. Since you've got the comments there already, you may as well propagate them to the user through /// comments. In general, having a /// comment on every type, member, union case, and record field is a good thing.

The same applies to DiffSharp: again, readable once you know the AD techniques, but more comments would be helpful. For example, the union cases D and DM etc. don't have comments.
