hypelib / hype
Hype: Compositional Machine Learning and Hyperparameter Optimization
Home Page: http://hypelib.github.io/Hype/
License: MIT License
Hi,
I would like to train a phrase-generating model on large inputs. I tried to adapt the example on the website, but, unsurprisingly, I get an OutOfMemoryException.
What would be a good approach to this kind of task?
Thanks,
Adrian
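A common way around this is to avoid pushing the whole dataset through the model at once and train on minibatches instead, so only a small slice of the data is live at any step. A heavily hedged sketch, assuming Hype's Params record exposes a Batch setting with a Minibatch case (that is my reading of the project docs, not a verified signature; check the field and case names against your version):

```fsharp
open Hype

// Assumption: Params has Batch and Epochs fields and Batch has a
// Minibatch case -- illustrative names, check against your Hype version.
let par = {Params.Default with
            Batch = Minibatch 32   // compute each gradient from 32 examples
            Epochs = 10}
// Only a 32-column slice of the dataset is touched per step, which keeps
// peak memory bounded regardless of the dataset size.
let w, loss, _, _ = Optimize.Train(func, w0, dataset, par)
```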
Hi. I am lost after the following part.
let func = fun w x ->
    l.Decode w
    l.Run x
let w0 = l.Encode()
let w, loss, _, lhist = Optimize.Train(func, w0, dataset, valid, par)
We send func to be optimized w.r.t. dataset. But the Optimize.Train method has no idea about the individual layers of the FeedForward network. My guess is that in the Optimize module the whole network is flattened into a single logistic-regression-like function, hence the vector form of the weights (I am probably wrong). Finally, where do the biases get updated?
Note: I am deeply sorry to ask these questions on the issues board, but the only person who can answer them is the person who wrote the library. I have been trying to implement the Contrastive Divergence algorithm, but got stuck on this particular part of Hype.
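In fact, Optimize.Train never needs to know about the individual layers: Encode flattens every parameter of the network, biases included, into one DV, and func calls Decode to write each candidate vector back into the structured layers before the forward pass. Since the biases are components of w0, the optimizer updates them exactly like the weights. A minimal sketch of that contract, where the shapes and slice bounds are my own illustrative assumptions rather than Hype's actual implementation:

```fsharp
open Hype
open DiffSharp.AD.Float32

// Illustrative single layer with a 2x3 weight matrix and 2 biases.
type TinyLayer() =
    let mutable W = Rnd.NormalDM(2, 3, D 0.f, D 0.1f)
    let mutable b = DV.create 2 (D 0.f)
    /// Pack weights AND biases into one flat vector for the optimizer.
    member this.Encode() = DV.Append(DM.toDV W, b)   // 6 + 2 = 8 entries
    /// Unpack the optimizer's candidate vector back into W and b.
    member this.Decode (w:DV) =
        W <- DV.toDM 2 w.[..5]   // first 6 entries -> 2x3 matrix
        b <- w.[6..]             // last 2 entries -> biases
    member this.Run (x:DV) = W * x + b
```

Because reverse-mode AD differentiates func with respect to the whole flat vector, the gradient entries corresponding to the biases are computed and applied automatically.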
Here is my simple autoencoder code:
type Activation =
    | Sigm
    | Softmax
    | Linear
    | Tanh
    member this.funDM =
        match this with
        | Sigm -> DM.Sigmoid
        | Softmax -> DM.mapCols DV.SoftMax
        | Tanh -> DM.Tanh
        | Linear -> id
    member this.funDV =
        match this with
        | Sigm -> DV.Sigmoid
        | Softmax -> DV.SoftMax
        | Tanh -> DV.Tanh
        | Linear -> id

let inline ( *+) dv f = DV.Append(dv, toDV [|f|])

/// i: number of inputs, h: number of hidden units
type AutoEncoder(i, h, a:Activation) =
    // Helper functions
    let removeBias (X:DM) = X.[0..X.Rows-2, *]
    let replaceBias (X:DM) = X.[0..X.Rows-2, *] |> DM.appendRow (DV.create X.Cols 1)
    let appendBiasDM (X:DM) = DM.appendRow (DV.create X.Cols 1) X
    /// Weights W:DM. The weights are tied: both layers use the same matrix, one the transpose of the other.
    member val W = Rnd.NormalDM(h+1, i+1, D 0.f, D 0.1f) with get, set
    /// Flattened weights W':DV
    member this.W' with get() = DM.toDV this.W and set W' = this.W <- DV.toDM (h+1) W'
    /// Forward propagate the data
    member this.RunDM (X:DM) =
        let h = X |> appendBiasDM |> (*) this.W |> replaceBias
        (DM.Transpose this.W) * h |> removeBias
    /// Forward propagation when W' is provided by the optimization algorithm
    member this.Run (W':DV) (X:DM) =
        this.W' <- W'
        this.RunDM X
    /// Encode data and get the hidden units
    member this.Encode (X:DM) = X |> appendBiasDM |> (*) this.W |> replaceBias
// TEST
let ae = AutoEncoder(3, 2, Activation.Sigm)
let p = {Params.Default with Regularization = NoReg; Loss = Loss.Quadratic}
let X' = (toDM [[1.f; 5.f; 2.f]; [8.f; 2.f; 2.f]; [1.f; 5.f; 2.f]; [8.f; 2.f; 2.f];
                [1.1f; 5.2f; 2.f]; [8.1f; 2.1f; 2.f]; [0.9f; 4.9f; 2.f]; [7.9f; 1.9f; 2.f]]) / 10 |> DM.Transpose
let ds' = Dataset(X', X'.Copy())
let a, b, _, _ = Optimize.Train(ae.Run, ae.W', ds', p)
This code produces the following error.
System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at DiffSharp.AD.Float32.tupledArg_2@2296-39(Int32 j, Int32 i, DM b, Single[,] a, Single[,] bb)
   at DiffSharp.AD.Float32.DM.AddSubMatrix(DM a, Int32 i, Int32 j, DM b)
   at DiffSharp.AD.Float32.DOps.pushRec@2968(FSharpList`1 ds)
   at Hype.Optimize.Train(FSharpFunc`2 f, DV w0, Dataset d, Dataset v, Params par)
   at Hype.Optimize.Train(FSharpFunc`2 f, DV w0, Dataset d, Params par)
I also tried debugging against the source code of Hype and DiffSharp, but I couldn't figure out where things go wrong.
The performance of the recurrent neural networks sample is dominated by Array2D.copy, called as part of DM addition inside pushRec in DiffSharp.

It might be possible to use in-place addition. I'm not entirely sure which addition operation in pushRec is dominating, but in cases like this it looks like in-place addition might be appropriate:

d.A <- d.A + (v :?> DM)

Regardless, you're using Array2D.copy in DiffSharp, and that seems slower than it should be, since it does an initBased copy. That should be fixed in FSharp.Core, but in the meantime you can use the following, which seems about 4x faster:
module Array2D =
    let copyFast (array: 'T[,]) = array.Clone() :?> 'T[,]
For example:

#time "on"

let test1() =
    let mutable res = Array2D.zeroCreate<float32> 100 100
    for i in 0 .. 1000 do
        for j in 0 .. 100 do
            res <- Array2D.copy res

let test2() =
    let mutable res = Array2D.zeroCreate<float32> 100 100
    for i in 0 .. 1000 do
        for j in 0 .. 100 do
            res <- Array2D.copyFast res

test1() // 4.4s
test2() // 0.98s
match r with
| L1Reg l -> fun (w:DV) -> l * (DV.l1norm w)
| L2Reg l -> fun w -> l * (DV.l2normSq w)
| NoReg -> fun _ -> D 0.f
These regularization methods penalize the bias terms as well; bias terms should be excluded from the penalty.
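One fix, assuming the caller knows the layout of the flattened parameter vector (for example, biases packed at the end; that layout is my assumption, not something Hype guarantees), is a regularizer that only penalizes the weight slice:

```fsharp
open DiffSharp.AD.Float32

// Hedged sketch: L2 penalty over all but the last nb components of the
// flattened parameter vector, assuming the nb biases are packed at the end.
let l2RegNoBias (l:D) (nb:int) =
    fun (w:DV) -> l * DV.l2normSq w.[..w.Length - nb - 1]
```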
The library is light on comments (though there are some! And the code is readable).
One easy way to improve things is to move // comments like these:
| Constant of D // Constant
| Decay of D * D // 1 / t decay, a = a0 / (1 + kt). Initial value, decay rate
| ExpDecay of D * D // Exponential decay, a = a0 * Exp(-kt). Initial value, decay rate
to be /// comments like these:
/// Constant learning rate
| Constant of D
/// Learning rate of 1 / t decay, a = a0 / (1 + kt). Initial value, decay rate
| Decay of D * D
/// Exponential decay learning rate, a = a0 * Exp(-kt). Initial value, decay rate
| ExpDecay of D * D
etc. Since you've got the comments there already, you may as well propagate them to the user through /// comments. In general, having a /// comment on every type, member, union case, and record field is a good thing.
The same applies to DiffSharp: again, readable once you know the AD techniques, but more comments would be helpful. For example, the union cases D and DM don't have comments.