Comments (6)
@smoothdeveloper exactly. `removeBias` and `appendBiasDM` alter the sizes: one adds a row while the other removes one. However, they are intermediate operations.
The `Run` method takes the weights and the input, `W'[m + k]` and `X[m,n]` respectively. `W'[m + k]` is then reshaped into `W[k, m]`. The output is also `X[m,n]`. For this kind of pipeline it is very common to add and remove bias terms as rows (or columns, for that matter) of the weight matrix, treated as the weights of an additional input row which is always 1s.
ae.Run ae.W' (toDM [[0.1f;0.2f;0.3f];[0.2f;0.02f;0.02f];[0.9f;0.02f;0.02f];[0.9f;0.02f;0.02f]] |> DM.Transpose)
You can use this line to verify that input and output have the same dimensions. The intermediate operations are also compatible with each other, because `Autoencoder.W` is created as (o+1, i+1), which means the addition of bias terms is accounted for.
As for `replaceBias`, it doesn't change the dimensions; it simply writes 1s to the last row.
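To make the dimension bookkeeping concrete, here is a small stand-alone sketch. These helpers are re-implemented over a plain `float[,]` purely for illustration (they mirror the names above but are not the DiffSharp/Hype versions):

```fsharp
// Illustration only: bias-row helpers over a plain 2D array, not the DM versions.
let appendBiasRow (x: float[,]) =
    let m, n = Array2D.length1 x, Array2D.length2 x
    // (m, n) -> (m + 1, n): the extra row of 1s acts as the always-on bias input
    Array2D.init (m + 1) n (fun i j -> if i = m then 1.0 else x.[i, j])

let removeBiasRow (x: float[,]) =
    let m, n = Array2D.length1 x, Array2D.length2 x
    // (m, n) -> (m - 1, n): drop the last row
    Array2D.init (m - 1) n (fun i j -> x.[i, j])

let replaceBiasRow (x: float[,]) =
    let m, n = Array2D.length1 x, Array2D.length2 x
    // (m, n) -> (m, n): overwrite the last row with 1s, dimensions unchanged
    Array2D.init m n (fun i j -> if i = m - 1 then 1.0 else x.[i, j])

let x        = Array2D.init 2 3 (fun i j -> float (i * 3 + j))
let withBias = appendBiasRow x          // 3 x 3: one row added
let back     = removeBiasRow withBias   // 2 x 3: equal to x again
let replaced = replaceBiasRow withBias  // 3 x 3: last row forced to 1s
```

The point is that `appendBiasRow` and `removeBiasRow` are inverses dimension-wise, while `replaceBiasRow` is shape-preserving, which matches the description above.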
from hype.
To help with debugging, I changed the `ff` function to not be inline. It seems that in this case `j + jj` = 2 + 6 = 8 while `aa.GetLength(1)` = 8, so the index is out of bounds.
The code in `DOps.reversePush` was too scary for me to try to understand why `j = 2` is passed.
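For reference, the arithmetic behind that failure: when `aa.GetLength(1)` is 8, the valid second-dimension indices are 0..7, so `j + jj = 2 + 6 = 8` is exactly one past the end. A minimal repro of that failure mode on a plain .NET 2D array (not the Hype code itself):

```fsharp
// Second dimension has length 8, so valid column indices are 0..7.
let aa = Array2D.zeroCreate<float32> 4 8
let j, jj = 2, 6

// The bounds check that fails: 8 < 8 is false.
let inBounds = j + jj < aa.GetLength 1

// Actually indexing one past the end raises IndexOutOfRangeException.
let threw =
    try
        aa.[0, j + jj] |> ignore
        false
    with :? System.IndexOutOfRangeException -> true
```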
@smoothdeveloper I still couldn't comprehend what is going on. I combined DiffSharp and Hype into one project, did what you suggested, and debugged. It is extremely difficult for me to find at which point we get an unexpected value. And apparently the library is written by a mathematician =) so the notation and naming make it a little more difficult for me to debug easily.
@zgrkpnr to debug, this is what I used: a .fsx file I put in the Hype/docs/input folder (just so you get the paths right), with DiffSharp built in Debug (the library is in its own folder).
#r "../../../../DiffSharp/DiffSharp/src/DiffSharp/bin/Debug/DiffSharp.dll"
#r "../../src/Hype/bin/Release/Hype.dll"
open DiffSharp.AD.Float32
open Hype
type Activation =
|Sigm
|Softmax
|Linear
|Tanh
member this.funDM =
match this with
|Sigm -> DM.Sigmoid
|Softmax -> DM.mapCols DV.SoftMax
|Tanh -> DM.Tanh
|Linear -> id
member this.funDV =
match this with
|Sigm -> DV.Sigmoid
|Softmax -> DV.SoftMax
|Tanh -> DV.Tanh
|Linear -> id
let inline ( *+) dv f = DV.Append(dv, toDV [|f|])
type AutoEncoder(i,h,a:Activation) =
let removeBias (X:DM) = X.[0..X.Rows-2, *]
let replaceBias (X:DM) = X.[0..X.Rows-2, *] |> DM.appendRow (DV.create X.Cols 1)
let appendBiasDM (X:DM) = DM.appendRow (DV.create X.Cols 1) X
/// Weights W:DM. Since the autoencoder has tied weights, both layers use the same weights, one the transpose of the other.
member val W = Rnd.NormalDM(h+1, i+1, D 0.f, D 0.1f) with get, set
/// Flattened weights W':DV
member this.W' with get() = DM.toDV this.W and set W' = this.W <- DV.toDM (h+1) W'
/// Forward propagate the data
member this.RunDM (X:DM) = let h = X |> appendBiasDM |> (*) this.W |> replaceBias
(DM.Transpose this.W) * h |> removeBias
/// Forward propagation when W' provided by optimization algorithm
member this.Run (W':DV) (X:DM) = this.W' <- W'
this.RunDM X
/// Encode data and get hidden unit
member this.Encode (X:DM) = X |> appendBiasDM |> (*) this.W |> replaceBias
// TEST
let ae = AutoEncoder(3,2, Activation.Sigm)
let p = {Params.Default with Regularization = NoReg; Loss = Loss.Quadratic }
let X' = ((toDM [[1.f;5.f;2.f];[8.f;2.f;2.f];[1.f;5.f;2.f];[8.f;2.f;2.f];
[1.1f;5.2f;2.f];[8.1f;2.1f;2.f];[0.9f;4.9f;2.f];[7.9f;1.9f;2.f]])) / 10 |> DM.Transpose
let ds' = Dataset(X', X'.Copy())
let a,b,_,_ = Optimize.Train(ae.Run , ae.W', ds', p)
I evaluated all but the last line with "Execute in Interactive", opened AD.Float32.fs from DiffSharp (compiled as Debug previously) in Visual Studio and put a breakpoint where I wanted, then selected the last line in the script and used "Debug in Interactive".
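As a quick sanity check of the shapes in the script above (assuming the (h+1, i+1) weight layout described earlier): with `AutoEncoder(3, 2, ...)`, `W` is 3×4, so the flattened `W'` has 12 elements, and `X'` is 3×8 after the transpose (8 samples of 3 features). The arithmetic, independent of DiffSharp:

```fsharp
// Shape bookkeeping for AutoEncoder(3, 2, Activation.Sigm) from the script above.
let i, h = 3, 2                  // visible and hidden sizes
let wRows, wCols = h + 1, i + 1  // W : (h+1) x (i+1) = 3 x 4, bias rows included
let wLen = wRows * wCols         // flattened W' length = 12
let samples = 8
let xRows, xCols = i, samples    // X' after DM.Transpose: 3 x 8
```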
And apparently the library is written by a mathematician
yes, but at the same time, if I had to implement those algorithms based on what I read in math papers (which I'd probably have a difficult time comprehending), the code would probably use the same kind of conventions :)
I've noticed that in a few spots the library takes `obj` parameters and does dynamic matching (`let rec pushRec (ds:(obj*obj) list) =`), and I wonder whether it would make sense to create specific return types as DUs for those functions. For now that would create small allocations, but it would add safety and clarity to those areas, and the compiler will allow struct DUs at some point, which will make the allocation overhead smaller. That said, I don't have much experience dealing with performance-sensitive code like this.
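For example, the `(obj * obj)` pairs could be given a dedicated type. A hypothetical sketch of the idea (the case names below are made up for illustration, they are not DiffSharp's actual types):

```fsharp
// Hypothetical: replace (obj * obj) list with a DU so the match is exhaustive
// and checked by the compiler, instead of failing at runtime on a bad pairing.
type PushItem =
    | ScalarPair of float * float
    | VectorPair of float[] * float[]

let describe item =
    match item with
    | ScalarPair (a, b) -> sprintf "scalars %f %f" a b
    | VectorPair (a, b) -> sprintf "vectors of length %d and %d" a.Length b.Length

let ds = [ ScalarPair (1.0, 2.0); VectorPair ([| 1.0 |], [| 2.0; 3.0 |]) ]
let described = ds |> List.map describe
```

Each `PushItem` is a heap allocation today, which is why a future struct DU (`[<Struct>]`) would reduce the overhead for hot paths like `reversePush`.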
@smoothdeveloper my problem was not really about debugging. I debugged and checked the dimensions of all the matrices and vectors. Then I thought I might have missed some point along the way, so I wrote down all the expected dimensions on paper. (Yes, on paper =)) All the dimensions are as expected. This is probably my lack of understanding of the underlying implementation, so someone should reproduce the issue and find out whether the bug is in my code, in DiffSharp, or in Hype.
(I suspect my code has an issue, but I cannot be sure.)
I was just spelling out the detailed steps because of "I combined DiffSharp and Hype into one project" in your comment (which I understood as you putting the code of both libraries into a custom project).
Looking at your code, there are removeBias / replaceBias and appendBiasDM, which have 2 and 1 respectively; is that correct? It looks like they could alter the matrix sizes.