hypelib / hype
Hype: Compositional Machine Learning and Hyperparameter Optimization
Home Page: http://hypelib.github.io/Hype/
License: MIT License
Hi,
I would like to train a phrase-generating model on large inputs. I tried to adapt the example on the website, but, unsurprisingly, I get an OutOfMemoryException.
What would be a good approach to this kind of task?
Thanks,
Adrian
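A common way around this is to avoid pushing the whole dataset through the model at once and train on minibatches instead, so only a small slice of the data is live at any step. A heavily hedged sketch, assuming Hype's Params record exposes a Batch setting with a Minibatch case (that is my reading of the project docs, not a verified signature; check the field and case names against your version):

```fsharp
open Hype

// Assumption: Params has Batch and Epochs fields and Batch has a
// Minibatch case -- illustrative names, check against your Hype version.
let par = {Params.Default with
            Batch = Minibatch 32   // compute each gradient from 32 examples
            Epochs = 10}
// Only a 32-column slice of the dataset is touched per step, which keeps
// peak memory bounded regardless of the dataset size.
let w, loss, _, _ = Optimize.Train(func, w0, dataset, par)
```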
Hi. I am lost after the following part.
let func = fun w x ->
    l.Decode w
    l.Run x
let w0 = l.Encode()
let w, loss, _, lhist = Optimize.Train(func, w0, dataset, valid, par)
We send func to be optimized w.r.t. dataset. But the Optimize.Train method has no idea about the individual layers of the FeedForward network. My guess is that in the Optimize module the whole network is flattened into a single logistic-regression-like function, hence the vector form of the weights (I am probably wrong). Finally, where do the biases get updated?
Note: I am deeply sorry to ask these questions on the issues board, but the only person who can answer them is the person who wrote the library. I have been trying to implement the Contrastive Divergence algorithm, but got stuck on this particular part of Hype.
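In fact, Optimize.Train never needs to know about the individual layers: Encode flattens every parameter of the network, biases included, into one DV, and func calls Decode to write each candidate vector back into the structured layers before the forward pass. Since the biases are components of w0, the optimizer updates them exactly like the weights. A minimal sketch of that contract, where the shapes and slice bounds are my own illustrative assumptions rather than Hype's actual implementation:

```fsharp
open Hype
open DiffSharp.AD.Float32

// Illustrative single layer with a 2x3 weight matrix and 2 biases.
type TinyLayer() =
    let mutable W = Rnd.NormalDM(2, 3, D 0.f, D 0.1f)
    let mutable b = DV.create 2 (D 0.f)
    /// Pack weights AND biases into one flat vector for the optimizer.
    member this.Encode() = DV.Append(DM.toDV W, b)   // 6 + 2 = 8 entries
    /// Unpack the optimizer's candidate vector back into W and b.
    member this.Decode (w:DV) =
        W <- DV.toDM 2 w.[..5]   // first 6 entries -> 2x3 matrix
        b <- w.[6..]             // last 2 entries -> biases
    member this.Run (x:DV) = W * x + b
```

Because reverse-mode AD differentiates func with respect to the whole flat vector, the gradient entries corresponding to the biases are computed and applied automatically.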
Here is my simple autoencoder code:
type Activation =
    | Sigm
    | Softmax
    | Linear
    | Tanh
    member this.funDM =
        match this with
        | Sigm -> DM.Sigmoid
        | Softmax -> DM.mapCols DV.SoftMax
        | Tanh -> DM.Tanh
        | Linear -> id
    member this.funDV =
        match this with
        | Sigm -> DV.Sigmoid
        | Softmax -> DV.SoftMax
        | Tanh -> DV.Tanh
        | Linear -> id

let inline ( *+) dv f = DV.Append(dv, toDV [|f|])

/// i: number of inputs, h: number of hidden units
type AutoEncoder(i, h, a:Activation) =
    // Helper functions
    let removeBias (X:DM) = X.[0..X.Rows-2, *]
    let replaceBias (X:DM) = X.[0..X.Rows-2, *] |> DM.appendRow (DV.create X.Cols 1)
    let appendBiasDM (X:DM) = DM.appendRow (DV.create X.Cols 1) X
    /// Weights W:DM. The weights are tied: both layers use the same matrix, one the transpose of the other.
    member val W = Rnd.NormalDM(h+1, i+1, D 0.f, D 0.1f) with get, set
    /// Flattened weights W':DV
    member this.W' with get() = DM.toDV this.W and set W' = this.W <- DV.toDM (h+1) W'
    /// Forward propagate the data
    member this.RunDM (X:DM) =
        let h = X |> appendBiasDM |> (*) this.W |> replaceBias
        (DM.Transpose this.W) * h |> removeBias
    /// Forward propagation when W' is provided by the optimization algorithm
    member this.Run (W':DV) (X:DM) =
        this.W' <- W'
        this.RunDM X
    /// Encode data and get the hidden units
    member this.Encode (X:DM) = X |> appendBiasDM |> (*) this.W |> replaceBias
// TEST
let ae = AutoEncoder(3, 2, Activation.Sigm)
let p = {Params.Default with Regularization = NoReg; Loss = Loss.Quadratic}
let X' = (toDM [[1.f; 5.f; 2.f]; [8.f; 2.f; 2.f]; [1.f; 5.f; 2.f]; [8.f; 2.f; 2.f];
                [1.1f; 5.2f; 2.f]; [8.1f; 2.1f; 2.f]; [0.9f; 4.9f; 2.f]; [7.9f; 1.9f; 2.f]]) / 10 |> DM.Transpose
let ds' = Dataset(X', X'.Copy())
let a, b, _, _ = Optimize.Train(ae.Run, ae.W', ds', p)
This code produces the following error.
System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at DiffSharp.AD.Float32.tupledArg_2@2296-39(Int32 j, Int32 i, DM b, Single[,] a, Single[,] bb)
   at DiffSharp.AD.Float32.DM.AddSubMatrix(DM a, Int32 i, Int32 j, DM b)
   at DiffSharp.AD.Float32.DOps.pushRec@2968(FSharpList`1 ds)
   at Hype.Optimize.Train(FSharpFunc`2 f, DV w0, Dataset d, Dataset v, Params par)
   at Hype.Optimize.Train(FSharpFunc`2 f, DV w0, Dataset d, Params par)
I also tried debugging against the source code of Hype and DiffSharp, but I couldn't figure out where things go wrong.
The performance of the recurrent neural networks sample is dominated by Array2D.copy, called as part of DM addition inside pushRec in DiffSharp.

It might be possible to use in-place addition. I'm not entirely sure which addition operation in pushRec is dominating, but in cases like this it looks like in-place addition might be appropriate:

d.A <- d.A + (v :?> DM)

Regardless, you're using Array2D.copy in DiffSharp, and that seems slower than it should be, since it does an initBased copy. That should be fixed in FSharp.Core, but in the meantime you can use the following, which seems about 4x faster:
module Array2D =
    let copyFast (array: 'T[,]) = array.Clone() :?> 'T[,]
For example:

#time "on"

let test1() =
    let mutable res = Array2D.zeroCreate<float32> 100 100
    for i in 0 .. 1000 do
        for j in 0 .. 100 do
            res <- Array2D.copy res

let test2() =
    let mutable res = Array2D.zeroCreate<float32> 100 100
    for i in 0 .. 1000 do
        for j in 0 .. 100 do
            res <- Array2D.copyFast res

test1() // 4.4s
test2() // 0.98s
match r with
| L1Reg l -> fun (w:DV) -> l * (DV.l1norm w)
| L2Reg l -> fun w -> l * (DV.l2normSq w)
| NoReg -> fun _ -> D 0.f
These regularization methods penalize the bias terms as well; bias terms should be excluded from the penalty.
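One fix, assuming the caller knows the layout of the flattened parameter vector (for example, biases packed at the end; that layout is my assumption, not something Hype guarantees), is a regularizer that only penalizes the weight slice:

```fsharp
open DiffSharp.AD.Float32

// Hedged sketch: L2 penalty over all but the last nb components of the
// flattened parameter vector, assuming the nb biases are packed at the end.
let l2RegNoBias (l:D) (nb:int) =
    fun (w:DV) -> l * DV.l2normSq w.[..w.Length - nb - 1]
```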
The library is light on comments (though there are some! And the code is readable).
One easy way to improve things is to move // comments like these:
| Constant of D // Constant
| Decay of D * D // 1 / t decay, a = a0 / (1 + kt). Initial value, decay rate
| ExpDecay of D * D // Exponential decay, a = a0 * Exp(-kt). Initial value, decay rate
to be /// comments like these:
/// Constant learning rate
| Constant of D
/// Learning rate of 1 / t decay, a = a0 / (1 + kt). Initial value, decay rate
| Decay of D * D
/// Exponential decay learning rate, a = a0 * Exp(-kt). Initial value, decay rate
| ExpDecay of D * D
etc. Since you've got the comments there already, you may as well propagate them to the user through /// comments. In general, having a /// comment on every type, member, union case, and record field is a good thing.
The same applies to DiffSharp: again, readable once you know the AD techniques, but more comments would be helpful. For example, the union cases D and DM don't have comments.