
denizyuret / autograd.jl

169 stars · 20 watchers · 26 forks · 781 KB

Julia port of the Python autograd package.

License: Other

Languages: Julia 99.43%, Perl 0.57%
Topics: autograd, knet, automatic-differentiation, machine-learning, deep-learning, data-science, neural-networks

autograd.jl's People

Contributors

carlolucibello, davidssmith, denizyuret, ekinakyurek, emreyolcu, gunnarfarneback, juliatagbot, mdpradeep, ozanarkancan, rfourquet, staticfloat, tkelman, xiaodaigh, ylxdzsw


autograd.jl's Issues

float type error in ^2

On Julia 0.6, taking powers of a recorded Float32 array yields a Float64 array:

julia> a=Rec(rand(Float32,3))
S[3]113R

julia> AutoGrad.unbox(a)
3-element Array{Float32,1}:
 0.988303
 0.40873 
 0.371695

julia> AutoGrad.unbox(a.^2)
3-element Array{Float64,1}:
 0.976743
 0.16706 
 0.138157
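
For comparison, broadcasting the same power over an unrecorded Float32 array preserves the element type (plain Julia, no AutoGrad involved), so the promotion happens somewhere in the recorded code path:

julia> eltype(rand(Float32,3).^2)
Float32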

broadcast error for integer power

On master and julia 0.7

julia> grad(x->sum(x.^2))([1,2,3])
┌ Warning: broadcast will default to iterating over its arguments in the future. Wrap arguments of
│ type `x::Rec{Array{Int64,1}}` with `Ref(x)` to ensure they broadcast as "scalar" elements.
│   caller = ip:0x0
└ @ Core :-1
ERROR: DimensionMismatch("Cannot multiply two vectors")
Stacktrace:
 [1] *(::Array{Int64,1}, ::Array{Int64,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v0.7/LinearAlgebra/src/deprecated.jl:566
 [2] power_by_squaring(::Array{Int64,1}, ::Int64) at ./intfuncs.jl:192
 [3] ^(::Array{Int64,1}, ::Int64) at ./deprecated.jl:55
 [4] (::getfield(AutoGrad, Symbol("##rfun#7#9")){typeof(^)})(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Rec{Array{Int64,1}}, ::Vararg{Any,N} where N) at /home/carlo/.julia/dev/AutoGrad/src/core.jl:133
 [5] rfun at /home/carlo/.julia/dev/AutoGrad/src/core.jl:130 [inlined]
 [6] ^ at ./none:0 [inlined]
 [7] macro expansion at ./none:0 [inlined]
 [8] literal_pow at ./none:0 [inlined]
 [9] _broadcast_getindex_evalf at ./broadcast.jl:574 [inlined]
 [10] _broadcast_getindex at ./broadcast.jl:547 [inlined]
 [11] getindex at ./broadcast.jl:507 [inlined]
 [12] copy at ./broadcast.jl:734 [inlined]
 [13] materialize at ./broadcast.jl:724 [inlined]
 [14] (::getfield(Main, Symbol("##15#16")))(::Rec{Array{Int64,1}}) at ./REPL[11]:1
 [15] (::getfield(AutoGrad, Symbol("##gradfun#1#2")){getfield(Main, Symbol("##15#16")),Int64})(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{Int64,1}) at /home/carlo/.julia/dev/AutoGrad/src/core.jl:95
 [16] (::getfield(AutoGrad, Symbol("#gradfun#3")){getfield(AutoGrad, Symbol("##gradfun#1#2")){getfield(Main, Symbol("##15#16")),Int64}})(::Array{Int64,1}) at /home/carlo/.julia/dev/AutoGrad/src/core.jl:39
 [17] top-level scope at none:0

The temporary workaround would be to use a float exponent

julia> grad(x->sum(x.^2.0))([1,2,3])
3-element Array{Float64,1}:
 2.0
 4.0
 6.0

AutoGrad error in backprop when iterating dict

using AutoGrad, Knet

u = [rand(2,3), rand(2)]
v = [rand(1,2), rand(1)]
m = Dict(:u=>u, :v=>v)

x,y = rand(3,4),rand(1,4)

pred(m,x) = foldl((x,w)->w[1]*x .+ w[2], x, [m[:u],m[:v]])

loss(m,x,y) = mean(abs2, pred(m,x)-y)

∇ = grad(loss)

∇(m,x,y)  # OK

l2(ws) = mean(mean.(abs2, ws))

loss(m,x,y) = mean(abs2, pred(m,x)-y) + mean(l2.(collect(values(m))))

∇ = grad(loss)

loss(m,x,y)  # OK 
∇(m,x,y)  # Error

ERROR: MethodError: Cannot `convert` an object of type AutoGrad.Rec{Array{Array{Float64,N} where N,1}} to an object of type Array{Array{Float64,N} where N,1}
This may have arisen from a call to the constructor Array{Array{Float64,N} where N,1}(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] convert(::Type{Pair{Symbol,Array{Array{Float64,N} where N,1}}}, ::Pair{Symbol,AutoGrad.Rec{Array{Array{Float64,N} where N,1}}}) at ./pair.jl:35
 [2] copy!(::Array{Pair{Symbol,Array{Array{Float64,N} where N,1}},1}, ::AutoGrad.Rec{Dict{Symbol,Array{Array{Float64,N} where N,1}}}) at ./abstractarray.jl:575
 [3] loss(::AutoGrad.Rec{Dict{Symbol,Array{Array{Float64,N} where N,1}}}, ::Array{Float64,2}, ::Array{Float64,2}) at ./REPL[114]:1
 [4] forward_pass(::Function, ::Tuple{Dict{Symbol,Array{Array{Float64,N} where N,1}},Array{Float64,2},Array{Float64,2}}, ::Array{Any,1}, ::Int64) at /home/ngphuoc/.julia/v0.6/AutoGrad/src/core.jl:88
 [5] (::AutoGrad.##gradfun#1#3{#loss,Int64})(::Array{Any,1}, ::Function, ::Dict{Symbol,Array{Array{Float64,N} where N,1}}, ::Vararg{Any,N} where N) at /home/ngphuoc/.julia/v0.6/AutoGrad/src/core.jl:39
 [6] (::AutoGrad.#gradfun#2)(::Dict{Symbol,Array{Array{Float64,N} where N,1}}, ::Vararg{Any,N} where N) at /home/ngphuoc/.julia/v0.6/AutoGrad/src/core.jl:39

May be related to this closed issue denizyuret/Knet.jl#109
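
A hedged workaround sketch: compute the penalty from explicitly indexed entries instead of iterating values(m), so only getindex calls (which the working loss above already relies on) get recorded. The penalty below has the same form but is not numerically identical to the original l2 term:

reg(m) = mean(abs2, m[:u][1]) + mean(abs2, m[:u][2]) + mean(abs2, m[:v][1]) + mean(abs2, m[:v][2])
loss(m,x,y) = mean(abs2, pred(m,x)-y) + reg(m)/4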

Using AutoGrad with CuArray

I modified the housing example to use CuArray as shown below. The forward pass is OK, but the backward pass fails with ERROR: MethodError: no method matching next(::AutoGrad.Rec{CuArray{Float32,2}}, ::Tuple{Base.OneTo{Int64},Int64})

using Knet,CuArrays
include(Knet.dir("data","housing.jl"))
data = housing()
x,y = data
w = Any[ 0.1f0*cu(randn(Float32,1,13)), 0.0f0 ]

predict(w,x) = w[1]*x .+ w[2]

loss(w,x,y) = mean(abs2,y-predict(w,x))
loss(w,x,y)  # 593.6816f0

lossgradient = grad(loss)
lossgradient(w,x,y)  # Error: MethodError: no method matching next(::AutoGrad.Rec{CuArray{Float32,2}}, ::Tuple{Base.OneTo{Int64},Int64})

function train(w, data; lr=.1)
  for d=data
    x,y = cu.(d)
    dw = lossgradient(w, x, y)
    for i in 1:length(w)
      w[i] -= lr * dw[i]
    end
  end
  return w
end


for i=1:10; train(w, [data]); println(loss(w,x,y)); end

ambiguity error in size(rec, dims...)

The following code works well

julia> f(x)=(p=size(x); p[1]*sum(x.^2))

julia> grad(f)(ones(3))
3-element Array{Float64,1}:
 6.0
 6.0
 6.0

and size(x,1) also works without problems, but here I get an ambiguity error:

julia> f(x)=(p=size(x,(1,2)...); p[1]*sum(x.^2))

julia> grad(f)(ones(3,2,2,3))
ERROR: MethodError: size(::AutoGrad.Rec{Array{Float64,4}}, ::Int64, ::Int64, ::Int64) is ambiguous. Candidates:
  size{N}(x, d1::Integer, d2::Integer, dx::Vararg{Integer,N}) at abstractarray.jl:48
  size{##305}(x::AutoGrad.Rec{##305}, i...)
 in f(::AutoGrad.Rec{Array{Float64,4}}) at ./REPL[7]:1
 in forward_pass(::Function, ::Tuple{Array{Float64,4}}, ::Array{Any,1}, ::Int64) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:88
 in (::AutoGrad.##gradfun#1#3{#f,Int64})(::Array{Any,1}, ::Function, ::Array{Float64,4}, ::Vararg{Array{Float64,4},N}) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:39
 in (::AutoGrad.#gradfun#2)(::Array{Float64,4}, ::Vararg{Array{Float64,4},N}) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:39
 in eval_user_input(::Any, ::Base.REPL.REPLBackend) at ./REPL.jl:64
 in macro expansion at ./REPL.jl:95 [inlined]
 in (::Base.REPL.##3#4{Base.REPL.REPLBackend})() at ./event.jl:68

This could easily be fixed by defining

size(x::AutoGrad.Rec, d1::Integer, d2::Integer, dx::Vararg{Integer}) = size(getval(x), d1, d2, dx...)

as I already tested. If someone points me to the right location where this function should reside I can file a PR.

crossref denizyuret/Knet.jl#139

gradient type bug with special functions

Here is the problem:

julia> x=rand(Float32, 3);

julia> loss(x)=sum(erf.(x))
loss (generic function with 1 method)


julia> grad(loss)(x)
3-element Array{Float64,1}:     # Float64 instead of Float32 !!! 
 1.05727 
 0.610081
 1.08713 

The same happens for erfc.

Bye
C

Stochastic Computation Graphs

Hi all !

First, thanks for your awesome package!
Do you plan to handle gradients in stochastic computation graphs, i.e. graphs with conditional probability distributions, such as

using Distributions
w = ones(5); x = rand(5);
p = 1 / (1 + exp(-vecdot(w, x)))
y = rand(Bernoulli(p))
loss = (y == 1)

In Schulman, J., Heess, N., Weber, T., & Abbeel, P., Gradient Estimation Using Stochastic Computation Graphs, it is described how to convert a stochastic computation graph into a deterministic one, so that backpropagation applied to a surrogate loss function yields an unbiased estimator of the gradient of the stochastic loss.

Do you have an idea how (and how much work would be required) this could be implemented with(in) your package?

Best,
Emile
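
As a sketch of what such support would automate, here is a hand-rolled score-function (REINFORCE) surrogate in the spirit of Schulman et al. It relies only on ordinary AutoGrad primitives (exp, log, .*, sum); getval is assumed to unbox a Rec, as it is used elsewhere in these issues:

using AutoGrad
sigmoid(z) = 1 / (1 + exp(-z))

function surrogate(w, x)
    p0 = sigmoid(sum(AutoGrad.getval(w) .* x))  # sample using the unrecorded value
    y  = rand() < p0 ? 1 : 0                    # stochastic node, kept off the tape
    L  = float(y == 1)                          # stochastic loss, treated as a constant
    p  = sigmoid(sum(w .* x))                   # recorded probability
    return L * (y == 1 ? log(p) : log(1 - p))   # surrogate = L * log p(y|w)
end

w = ones(5); x = rand(5)
g = grad(w -> surrogate(w, x))(w)               # single-sample unbiased estimate of ∇E[loss]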

Feature request: add support for argmax

I got the following error when using argmax:

MethodError: no method matching argmax(::AutoGrad.Rec{Array{Float32,2}})

I'll use a .== maximum(a,dim) for now. Thanks.
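
A hedged workaround sketch: argmax only produces integer indices, which carry no gradient anyway, so it can be computed on the unboxed value and the result used to index the recorded array (getval is assumed to be the unboxing function, as used elsewhere in these issues):

f(x) = x[argmax(AutoGrad.getval(x))]   # gradient flows through the indexing, not through argmax
grad(f)(rand(Float32, 4))              # expected: 1 at the position of the maximum, 0 elsewhere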

x.^2 compared with x .* x

I tried two implementation of MSE:

MSE(x, x̂) = mean(sum((x - x̂) .* (x - x̂), 1))

and

MSE(x, x̂) = mean(sum((x - x̂).^2, 1))

where the latter can give me NaN in the same program while the first one works fine. Any idea why this happens?

grad of `convert`?

I need to convert the type of some variables, but it crashes AutoGrad.

The code I use is:

function loss(w,x,ygold)
    ypred = predict(w,x)
    ynorm = ypred .- log(sum(exp(ypred),1))
    convert(Float32, -sum(ygold .* ynorm) / size(ygold, 2))
end

I tried @primitive, like @primitive convert(T,x),dy zerograd() dy or something similar, but never got it to work. Is there any way to define it properly, or to hack AutoGrad so that at least convert works without breaking other functionality?
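
A hedged sketch (the helper name cvt32 is mine, not part of AutoGrad): wrapping convert in a one-argument function avoids having to declare a gradient for the type parameter, and the incoming gradient can simply be passed back for the value argument, following the @primitive pattern used elsewhere in these issues:

using AutoGrad
cvt32(x) = convert(Float32, x)
@primitive cvt32(x),dy,y dy   # strictly, one may want oftype(x, dy) here
# then in the loss:
#   cvt32(-sum(ygold .* ynorm) / size(ygold, 2))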

Autograd usage problem

I want to compute the gradient at [1,2]. I wrote the following code to do this, but it gives an error. Thanks in advance for your help.

using Knet
J(w) = w[1]^2 .+ w[2]^2;
dJ = grad(J);
w = [1,2];
println(J(w))
println(dJ(w))

Output:
5
MethodError: ^(::AutoGrad.Rec{Int64}, ::Int64) is ambiguous. Candidates:
^(x, p::Integer) in Base at intfuncs.jl:199
^(x1::AutoGrad.Rec{##1045}, x2::##1046) where {##1045<:Number, ##1046<:Number} in AutoGrad
Possible fix, define
^(::AutoGrad.Rec{##1045<:Number}, ::Integer)
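
A hedged workaround sketch: the ambiguity comes from ^ with an Integer exponent on a recorded scalar, so writing the squares as products (or using floating-point exponents) sidesteps it:

using Knet
J(w) = w[1]*w[1] + w[2]*w[2]
dJ = grad(J)
println(dJ([1.0, 2.0]))   # expected [2.0, 4.0]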

bug second derivative tanh

julia> g1 = grad(tanh)
(::gradfun) (generic function with 1 method)

julia> grad(tanh)(1.) == 1-tanh(1.)^2
true

julia> grad(g1)(2.) #BUG: returns nothing
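
For reference, a plain finite-difference check (no AutoGrad internals assumed) shows what the second derivative should be:

h = 1e-6
d2 = (grad(tanh)(2.0 + h) - grad(tanh)(2.0 - h)) / 2h   # analytic value is -2*tanh(2)*(1 - tanh(2)^2) ≈ -0.136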

1.0.0 Todo list

  • broadcast of user defined functions not supported: #101
  • Solve outstanding bugs and issues.
  • Review and merge pull requests. #54 #57
  • Unit testing and more gradients in base.jl.
  • Unit testing for cat.jl.
  • Unit testing for iterate.jl.
  • Unit testing for linearalgebra.jl.
  • Compare all test files with src files to check for completeness.
  • Activate codecov.
  • Add missing derivatives, check out DiffRules.jl. #51
  • Overriding broadcasted vs broadcast? Measure memory and speed. Figure out Knet functions vs AutoGrad functions. What methods are defined?
  • scan code, finish todos and optimize
  • minimize function creation: go from f(Grad{n}) to back(f,n,...), recorder(f)(x) to forw(f,x...)?
  • speed up tests by reducing compilation during gradcheck.
  • try memoization on tape
  • fix scripts under prof/ and speed test.
  • test highorder: Innes has a PR?
  • fix docs and comments and examples
  • optimize sum_outgrads, reduce memory use through memoization and more UngetIndex.
  • ::Rec .^ ::Int does not work! #80
  • tracked array interface
  • Fix the documentation so core.jl documentation can be seen by Docutils.
  • Transfer to KnetML.
  • Clear outgrad after for loop in backward_pass to save memory.
  • Missing / broken linalg gradients.
  • Figure out a better way to specify test ranges for functions.

Coding practices

@CarloLucibello I am responding to your comments here to make the discussion easier:

can we avoid exporting 2 letters names to avoid conflicts and improve code readability?

I need these 2 letter names to avoid carpal tunnel when testing :) They are not documented nor are they used in final code. Originally I had put them in a subpackage so you would need to explicitly import AutoGrad.Abbrev to use them, so I can do that again.

I'm commenting here because as usual (and quite annoyingly) commits get pushed straight to master instead of using a separate branch with a corresponding PR.

I am sorry about the sloppy practice. In the recent rewrite of the core engine, I worked on a branch called corehack for about a week, then when all tests passed I approved myself and pushed things into master. What would you recommend? How should I do it differently?

Since now Rec has been renamed to Value, we shouldn't be using value for something which is not returning a Value type. In fact, per julia conventions, something like string(x) converts x to a String. Also, what was wrong with the old names?

This was for code readability. isa(a,Rec) is not readable, isa(a,Value), isa(a,Param), isa(a,Result) are all readable. getval(x) reads like an abbreviation, value(x) is more readable. In the past we had only one type of recorded/tracked object. In the new design we have user declared ones (Param), and intermediate/final results (Result), where both types cause tracking, so they have a supertype (Value).

Having said that, the only part of the code exposed to the API is Param and value. Old names still work and give a deprecation warning.

I am not 100% happy with too many values and Values going around either, but that is now just an internal problem to AutoGrad, not to the end user. However I am open to suggestions.

@primitive and broadcast

It is not clear how to define custom derivatives and have them work in conjunction with broadcast.
Is it possible to extend the @primitive macro to handle broadcasts automatically?

Here is a workaround I'm using

using AutoGrad
import AutoGrad: Broadcasted, getval  # getval is used below to unbox Rec arguments

fun(x) = x^2
grd(x) = 0 # fake gradient

function broadcast_func(f)
    f = Symbol(lstrip(string(f), '.'))
    bf = Symbol("broadcast#", f)
    if !isdefined(bf)
        @eval begin
            # We need this when x is of a regular type (@primitive only defines bf for Rec)
            $bf(x...) = broadcast($f, x...)
            $f(x::Broadcasted...) = $bf(getval.(x)...) |> Broadcasted
            # We need the following because sometimes the interpreter does not convert all args to Broadcasted:
            $f(x1::Broadcasted, x2) = $bf(getval(x1), x2) |> Broadcasted
            $f(x1, x2::Broadcasted) = $bf(x1, getval(x2)) |> Broadcasted
        end
    end
    bf
end

bf = broadcast_func(fun)
@primitive fun(x),dy,y  (@. dy*grd(x))
if bf != fun
    @eval @primitive $bf(x),dy,y  (@. dy*grd(x))
end

grad(x->sum(fun.(x)))([1.])
#1-element Array{Float64,1}:
# 0.0

objects of type AutoGrad.Result{KnetArray{Float32,2}} are not callable

I am getting the following error while implementing a Knet model.

ERROR: LoadError: MethodError: objects of type AutoGrad.Result{KnetArray{Float32,2}} are not callable
Stacktrace:
 [1] #differentiate#3(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Param{Array{KnetArray{Float32,N} where N,1}}, ::Vararg{Any,N} where N) at C:\Users\user\.julia\packages\AutoGrad\Vt8aS\src\core.jl:56
 [2] differentiate(::Function, ::Param{Array{KnetArray{Float32,N} where N,1}}, ::Vararg{Any,N} where N) at C:\Users\user\.julia\packages\AutoGrad\Vt8aS\src\core.jl:43
 [3] (::getfield(AutoGrad, Symbol("##gradfun#6#7")){typeof(loss),Int64,Bool})(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{KnetArray{Float32,N} where N,1}, ::Vararg{Any,N} where N) at C:\Users\user\.julia\packages\AutoGrad\Vt8aS\src\core.jl:127
 [4] (::getfield(AutoGrad, Symbol("#gradfun#8")){getfield(AutoGrad, Symbol("##gradfun#6#7")){typeof(loss),Int64,Bool}})(::Array{KnetArray{Float32,N} where N,1}, ::Vararg{Any,N} where N) at C:\Users\user\.julia\packages\AutoGrad\Vt8aS\src\core.jl:123
 [5] #epoch!#26(::Int64, ::Function, ::Array{KnetArray{Float32,N} where N,1}, ::Array{Any,1}, ::Array{Momentum,1}, ::Array{Float32,4}, ::Array{Any,1}) at C:\Users\user\Documents\julia projects\testme.jl:258
 [6] epoch!(::Array{KnetArray{Float32,N} where N,1}, ::Array{Any,1}, ::Array{Momentum,1}, ::Array{Float32,4}, ::Array{Any,1}) at C:\Users\user\Documents\julia projects\testme.jl:254
 [7] #train#30(::Type, ::Int64, ::Float64, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function) at C:\Users\user\Documents\julia projects\testme.jl:287
 [8] train() at C:\Users\user\Documents\julia projects\testme.jl:277
 [9] top-level scope at none:0
 [10] include_string(::Module, ::String, ::String) at .\loading.jl:1002
 [11] (::getfield(Atom, Symbol("##118#123")){String,String,Module})() at C:\Users\user\.julia\packages\Atom\WSz3k\src\eval.jl:120
 [12] withpath(::getfield(Atom, Symbol("##118#123")){String,String,Module}, ::String) at C:\Users\user\.julia\packages\CodeTools\8CjYJ\src\utils.jl:30
 [13] withpath at C:\Users\user\.julia\packages\Atom\WSz3k\src\eval.jl:46 [inlined]
 [14] #117 at C:\Users\user\.julia\packages\Atom\WSz3k\src\eval.jl:117 [inlined]
 [15] hideprompt(::getfield(Atom, Symbol("##117#122")){String,String,Module}) at C:\Users\user\.julia\packages\Atom\WSz3k\src\repl.jl:76
 [16] macro expansion at C:\Users\user\.julia\packages\Atom\WSz3k\src\eval.jl:116 [inlined]
 [17] (::getfield(Atom, Symbol("##116#121")){Dict{String,Any}})() at .\task.jl:85
in expression starting at C:\Users\user\Documents\julia projects\testme.jl:292

wishlist for next version

  • Julia 0.6 compatibility, fix warnings, fix broadcast.
  • Fix the test structure to be compatible with new BaseTest.
  • Try TakingBroadcastsSeriously.jl for broadcast ops. See JuliaLang/julia/issues/22060.
  • Consolidate gradcheck and check_grads into a single mechanism, rethink unit testing with addtest.
  • Remove eval from code, possibly responsible for post-Julia4 slowdown.
  • Fix the documentation so core.jl documentation can be seen by Docutils.
  • Provide a more flexible interface where the user decides what is boxed: the same method can also control what is tested in gradcheck.
  • Either test higher-order gradients thoroughly or remove them and simplify code.
  • Change citation in README to Knet paper.

`@dbg` macro never prints error messages for me (julia v0.5.2)

The @dbg lines don't seem to print anything for me (Version 0.5.2 (2017-05-06 16:34 UTC))

julia> AutoGrad.@dbg 1 3+3

julia> AutoGrad.@dbg 1 "Error msg."

julia> AutoGrad.@dbg 0 "Error msg."

julia> AutoGrad.@dbg -1 "Error msg."

julia>

I noticed this because I was perplexed that I didn't see an error message when I first tried the following:

julia> foo(X) = [x*x for x in X]
foo (generic function with 1 method)

julia> grad(foo)([1,2])

julia>

I now understand of course that it's because I'm trying to take the gradient of a non-scalar-value-returning function. But I was sad that I didn't see an error message.

But then I started looking through the code, and saw all these @dbg messages, but those don't seem to ever print for me. :(

julia> using AutoGrad

julia> f(x) = 3
f (generic function with 1 method)

julia> grad(f)(2)

julia>

I think that should've triggered this line?

Type of AutoGrad.Rec

Is it possible to make AutoGrad.Rec a subtype of Real so that this package can work with Distributions.jl?

Dispatching on variables w.r.t. which gradients are taken

Hi,

Firstly, I've found the package really helpful, so thanks for porting it over from Python.

It appears to be the case that one cannot do the following:

using AutoGrad
f(x::Float64) = x^2
df = grad(f)
df(5.0)

I obtain the error ERROR: MethodError: no method matching f(::AutoGrad.Rec{Float64}). If I have interpreted this correctly, it would appear that one cannot dispatch on the type of the arguments with respect to which we are taking gradients, without making the function a primitive and defining the appropriate computations involving the Jacobian. Is there a simple way to resolve this?

Thanks,
Will
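
A hedged workaround sketch: grad(f) calls f with an AutoGrad.Rec, which does not match x::Float64, so either drop the annotation or add a forwarding method for the recorded type (x*x is used instead of x^2 only to sidestep the separate integer-power ambiguity reported in other issues here):

using AutoGrad
f(x::Float64) = x*x
f(x::AutoGrad.Rec{Float64}) = x*x   # same body; the operations inside are still recorded
df = grad(f)
df(5.0)                             # expected 10.0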

missing mean(rec, 1)

The following two examples work:

julia> a=ones(2,2,2)

julia> f(a) =sum(a.^2, 1)[1]

julia> grad(f)(a)
2×2×2 Array{Float64,3}:
[:, :, 1] =
 2.0  0.0
 2.0  0.0

[:, :, 2] =
 0.0  0.0
 0.0  0.0

julia> f(a) =mean(a.^2)

julia> grad(f)(a)
2×2×2 Array{Float64,3}:
[:, :, 1] =
 0.25  0.25
 0.25  0.25

[:, :, 2] =
 0.25  0.25
 0.25  0.25

But this one does not:

julia> f(a) =mean(a.^2,1)[1]
WARNING: Method definition f(Any) in module Main at REPL[39]:1 overwritten at REPL[41]:1.
f (generic function with 1 method)

julia> grad(f)(a)
ERROR: MethodError: no method matching mean(::AutoGrad.Rec{Array{Float64,3}}, ::Int64)
Closest candidates are:
  mean(::Union{DataType,Function}, ::Any) at statistics.jl:11
  mean{T}(::AbstractArray{T,N}, ::Any) at statistics.jl:49
  mean(::Knet.KnetArray{T,N}, ::Any) at /home/carlo/.julia/v0.5/Knet/src/reduction.jl:153
  ...
 in f(::AutoGrad.Rec{Array{Float64,3}}) at ./REPL[41]:1
 in forward_pass(::Function, ::Tuple{Array{Float64,3}}, ::Array{Any,1}, ::Int64) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:88
 in (::AutoGrad.##gradfun#1#3{#f,Int64})(::Array{Any,1}, ::Function, ::Array{Float64,3}, ::Vararg{Array{Float64,3},N}) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:39
 in (::AutoGrad.#gradfun#2)(::Array{Float64,3}, ::Vararg{Array{Float64,3},N}) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:39
 in eval_user_input(::Any, ::Base.REPL.REPLBackend) at ./REPL.jl:64
 in macro expansion at ./REPL.jl:95 [inlined]
 in (::Base.REPL.##3#4{Base.REPL.REPLBackend})() at ./event.jl:68

Where is the sum function implemented, so that I can fix the issue for mean taking sum as an example?
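
An untested sketch: since sum(::Rec, dim) already works (as the first example shows) and size(::Rec, dim) is reported to work in other issues here, mean over a single dimension can be expressed through them without defining a new primitive:

import Base: mean
mean(x::AutoGrad.Rec, d::Integer) = sum(x, d) / size(x, d)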

gradcheck fails for broadcast operation

I came across an interesting bug in gradcheck. When w is initialized with zeros, gradcheck fails. Otherwise it gives the correct output.

julia> using Knet
julia> w = KnetArray(zeros(1,1024));
julia> x = KnetArray(randn(14,1024));
julia> broadcasting(w,x) =  mean(x .+ w);
julia> gradcheck(broadcasting,w,x;verbose=true)

WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
false

julia> w = KnetArray(randn(1,1024));
julia> gradcheck(broadcasting,w,x;verbose=true)
gcheck: d=0.0009765625 nd=0.0009765625004297092
gcheck: d=0.0009765625 nd=0.0009765624997314316
gcheck: d=0.0009765625 nd=0.0009765625004655185
gcheck: d=0.0009765625 nd=0.0009765624997135271
gcheck: d=0.0009765625 nd=0.000976562500447614
gcheck: d=0.0009765625 nd=0.000976562500165617
gcheck: d=0.0009765625 nd=0.0009765624997135271
gcheck: d=0.0009765625 nd=0.0009765624997135271
gcheck: d=0.0009765625 nd=0.0009765625004526496
gcheck: d=0.0009765625 nd=0.0009765624997493363
true

ERROR: Out of gpu memory

Hello,
First of all thanks for the nice package. I'm using it together with Knet and GPUArrays.
I'm running simulations on a deep network (VGG16) on the GPU. I noticed that if I use minibatches above a certain size (for me, >30) the following error is thrown:

ERROR: Out of gpu memory
 in Knet.KnetPtr(::Int64) at /home/enzo/.julia/v0.5/Knet/src/kptr.jl:96
 in relu(::Knet.KnetArray{Float32,4}) at /home/enzo/.julia/v0.5/Knet/src/unary.jl:85
 in (::AutoGrad.##rfun#4#6{Knet.#relu})(::Array{Any,1}, ::Function, ::AutoGrad.Rec{Knet.KnetArray{Float32,4}}, ::Vararg{AutoGrad.Rec{Knet.KnetArray{Float32,4}},N}) at /home/enzo/.julia/v0.5/AutoGrad/src/core.jl:110
 in relu(::AutoGrad.Rec{Knet.KnetArray{Float32,4}}) at ./<missing>:0
 in convnet(::AutoGrad.Rec{Array{Any,1}}, ::Knet.KnetArray{Float32,4}) at /home/enzo/work/hebbianrule/vgg_train.jl:301
 in loss(::AutoGrad.Rec{Array{Any,1}}, ::Knet.KnetArray{Float32,4}, ::Knet.KnetArray{Float32,2}) at /home/enzo/work/hebbianrule/vgg_train.jl:340
 in forward_pass(::Function, ::Tuple{Array{Any,1},Knet.KnetArray{Float32,4},Knet.KnetArray{Float32,2}}, ::Array{Any,1}, ::Int64) at /home/enzo/.julia/v0.5/AutoGrad/src/core.jl:75
 in (::AutoGrad.##gradfun#1#3{VGG.#loss,Int64})(::Array{Any,1}, ::Function, ::Array{Any,1}, ::Vararg{Any,N}) at /home/enzo/.julia/v0.5/AutoGrad/src/core.jl:47
 in (::AutoGrad.#gradfun#2)(::Array{Any,1}, ::Vararg{Any,N}) at /home/enzo/.julia/v0.5/AutoGrad/src/core.jl:47
 in #train!#3(::Int64, ::Int64, ::Float64, ::Float64, ::Float64, ::Float64, ::Function, ::Array{Any,1}, ::Array{Any,1}, ::Array{Float32,3}) at /home/enzo/work/hebbianrule/vgg_train.jl:106
 in (::VGG.#kw##train!)(::Array{Any,1}, ::VGG.#train!, ::Array{Any,1}, ::Array{Any,1}, ::Array{Float32,3}) at ./<missing>:0
 in main(::String) at /home/enzo/work/hebbianrule/vgg_train.jl:80

which apparently is due the way AutoGrad handles the memory. Is there a way to avoid this memory issue?

mixed Rec and KnetArray problem in vcat

When a matrix has mixed Rec and KnetArray elements, AutoGrad's vcat only looks at the first 2-3 elements. Here is a temporary solution:

using AutoGrad
let cat_r = recorder(cat); global vcatn
    function vcatn(a...)
        if any(x->isa(x,Rec), a)
            cat_r(1,a...)
        else
            vcat(a...)
        end
    end
end

It needs to be put into AutoGrad.
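
The same pattern presumably extends to horizontal concatenation (an untested sketch along the same lines):

let cat_r = recorder(cat); global hcatn
    function hcatn(a...)
        if any(x->isa(x,Rec), a)
            cat_r(2,a...)
        else
            hcat(a...)
        end
    end
end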

non-scalar loss function

I try to circumvent the limitation of a scalar loss function by (deep) copying the recorded tape at the right moment, e.g.

import AutoGrad: Tape, Rec, backward_pass
tape = Tape()
w = rand(2, 4)
w = Rec(w, tape)
x = rand(4)
y = w * x # or a deep neural net
y2 = deepcopy(y)
endbox1 = (y[1] - 1)^2
endbox2 = (y2[2] - 3)^4
g1 = backward_pass(w, endbox1, endbox1.tapes[1])
g2 = backward_pass(w, endbox2, endbox2.tapes[1])
g = 0.1 * g1 + g2

Is there a more elegant way to achieve this?
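
A hedged alternative sketch: when only the weighted combination of the two endpoints is needed, the same gradient follows by linearity from a single scalar loss and the public grad interface:

using AutoGrad
x = rand(4)
loss(w) = (y = w * x; 0.1*(y[1] - 1)^2 + (y[2] - 3)^4)
g = grad(loss)(rand(2, 4))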

bug with sum of number

Hi, I found this strange bug:

julia> a=rand(3)
3-element Array{Float64,1}:
 0.35653  
 0.0676971
 0.12812  

julia> grad(x->sum(x'x))(a)
ERROR: InexactError()
Stacktrace:
 [1] ones at ./array.jl:263 [inlined]
 [2] ones at ./array.jl:264 [inlined]
 [3] ones at ./array.jl:266 [inlined]
 [4] ones at ./<missing>:0 [inlined]
 [5] sum(::Type{AutoGrad.Grad{1}}, ::Float64, ::Float64, ::AutoGrad.Rec{Float64}) at ./<missing>:0
 [6] backward_pass(::AutoGrad.Rec{Array{Float64,1}}, ::AutoGrad.Rec{Float64}, ::Array{AutoGrad.Node,1}) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:256
 [7] (::AutoGrad.##gradfun#1#3{##5#6,Int64})(::Array{Any,1}, ::Function, ::Array{Float64,1}, ::Vararg{Array{Float64,1},N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:40
 [8] (::AutoGrad.#gradfun#2)(::Array{Float64,1}, ::Vararg{Array{Float64,1},N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:39
 [9] macro expansion at /home/carlo/.julia/v0.6/Revise/src/Revise.jl:775 [inlined]
 [10] (::Revise.##17#18{Base.REPL.REPLBackend})() at ./event.jl:73

It is strange because all of the following work (although there are some inconsistencies):

julia> grad(x->x'x)(a)
3-element Array{Float64,1}:
 0.71306 
 0.135394
 0.25624 

julia> grad(x->x'x)(1)
2

julia> grad(x->sum(x'x))(1)
1-element Array{Float64,1}:
 2.0

julia> grad(x->sum(x'))(1)
1×1 RowVector{Float64,Array{Float64,1}}:
 1.0

julia> grad(x->sum(x'))(a)
3-element Array{Float64,1}:
 1.0
 1.0
 1.0

static computation

Hi,
thanks for this nice package (and for Knet as well).
How difficult would it be to support static computation, at least for a limited set of operations? Here is a comparison with ReverseDiff.jl, where AutoGrad lags two orders of magnitude behind:

julia> f(x) = sum(x->x^2,x)
f (generic function with 1 method)

julia> v=rand(100);

julia> @benchmark grad(f)(v)
BenchmarkTools.Trial: 
  memory estimate:  411.38 KiB
  allocs estimate:  9398
  --------------
  minimum time:     1.068 ms (0.00% GC)
  median time:      1.088 ms (0.00% GC)
  mean time:        1.182 ms (6.49% GC)
  maximum time:     5.658 ms (78.79% GC)
  --------------
  samples:          4204
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> df! = ReverseDiff.compile_gradient(f,v)
(::#301) (generic function with 1 method)

julia> y=ones(v);

julia> @benchmark df!(y,v)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     11.353 μs (0.00% GC)
  median time:      11.426 μs (0.00% GC)
  mean time:        11.636 μs (0.00% GC)
  maximum time:     35.284 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

I encounter the same 100x slowdown if I increase the size to v=rand(1000)

Cheers,
Carlo

gradient of std(Array{Float32,N}) throws error

For example, when adding the following lines to test/statistics.jl, 6 of these tests fail:

@test gradcheck(mean, randn(Float32,2,3))
@test gradcheck(mean, randn(Float32,2,3), kwargs=[:dims=>1])
@test gradcheck(mean, randn(Float32,2,3), kwargs=[:dims=>(1,2)])
@test gradcheck(meanabs, randn(Float32,2,3))
@test gradcheck(meanabs2, randn(Float32,2,3))
@test gradcheck(var, randn(Float32,2,3))
@test gradcheck(var, randn(Float32,2,3), kwargs=[:dims=>1])
@test gradcheck(var, randn(Float32,2,3), kwargs=[:dims=>(1,2)])
@test gradcheck(std, randn(Float32,2,3))
@test gradcheck(std, randn(Float32,2,3), kwargs=[:dims=>1])
@test gradcheck(std, randn(Float32,2,3), kwargs=[:dims=>(1,2)])

It seems that the eltype of the input data is not taken into account when allocating the output:

  Expression: gradcheck(var, randn(Float32, 2, 3), kwargs=[:dims => (1, 2)])
  MethodError: no method matching sum_outgrads(::Array{Float32,2}, ::Array{Float64,2})
  Closest candidates are:
    sum_outgrads(!Matched::Nothing, ::Any) at /home/rene/.julia/dev/AutoGrad/src/core.jl:499
    sum_outgrads(::AbstractArray{T,N} where N, !Matched::AbstractArray{T,N} where N) where T at /home/rene/.julia/dev/AutoGrad/src/core.jl:486
    sum_outgrads(!Matched::Rec, ::Any) at /home/rene/.julia/dev/AutoGrad/src/core.jl:490
    ...
  Stacktrace:
   [1] backward_pass(::Rec{Array{Float32,2}}, ::Rec{Float32}, ::Array{AutoGrad.Node,1}) at /home/rene/.julia/dev/AutoGrad/src/core.jl:252
   [2] (::getfield(AutoGrad, Symbol("##gradfun#1#2")){getfield(Main, Symbol("#g#54")){getfield(Main, Symbol("##g#52#53")){typeof(var)}},Int64})(::Base.Iterators.Pairs{Symbol,Tuple{Int64,Int64},Tuple{Symbol},NamedTuple{(:dims,),Tuple{Tuple{Int64,Int64}}}}, ::Function, ::
Array{Float32,2}) at /home/rene/.julia/dev/AutoGrad/src/core.jl:41

Performance issue

Hi!
I am starting to use AutoGrad and I have a question concerning its performance compared to ReverseDiff.jl.

I have this basic setup:

using ReverseDiff, AutoGrad, BenchmarkTools

function f(x)
    m = length(x)
    return 100.0 * sum((x[i] - x[i - 1]^2)^2 for i=2:m) + (1.0 - x[1])^2
end

n = 2
x = [0.150369, 0.8463333]
u = [0.284309, 0.927797]

With ReverseDiff I do:

g = Array{Any}(n)
tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, rand(n)))
F = x -> ReverseDiff.gradient!(g, tape, x)
@benchmark F(x) setup=(x=rand(2))

Which leads to:

BenchmarkTools.Trial: 
  memory estimate:  32 bytes
  allocs estimate:  2
  --------------
  minimum time:     527.174 ns (0.00% GC)
  median time:      539.411 ns (0.00% GC)
  mean time:        547.806 ns (0.19% GC)
  maximum time:     6.466 μs (88.03% GC)
  --------------
  samples:          10000
  evals/sample:     190

And with AutoGrad I do:

gradg = AutoGrad.grad(f, 1)
@benchmark gradg(x) setup=(x=rand(2))

Which leads to:

BenchmarkTools.Trial: 
  memory estimate:  11.31 KiB
  allocs estimate:  299
  --------------
  minimum time:     37.077 μs (0.00% GC)
  median time:      38.893 μs (0.00% GC)
  mean time:        41.757 μs (3.26% GC)
  maximum time:     2.893 ms (95.67% GC)
  --------------
  samples:          10000
  evals/sample:     1

So AutoGrad is much slower than ReverseDiff. I assume it's because I can precompile a tape with ReverseDiff, which makes it faster.

Is it possible to get a similar level of performance using AutoGrad?
Thanks!

should be updated for julia 0.6

currently it fails to precompile on 0.6

julia> using AutoGrad
INFO: Recompiling stale cache file /home/carlo/.julia/lib/v0.6/AutoGrad.ji for module AutoGrad.

WARNING: deprecated syntax "typealias Tape Vector{Node}" at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:294.
Use "const Tape = Vector{Node}" instead.
WARNING: Array{T}(::Type{T}, m::Int) is deprecated, use Array{T}(m) instead.
Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:64
 [2] Array(::Type{AutoGrad.Node}, ::Int64) at ./deprecated.jl:51
 [3] #Rec#10(::Function, ::Tuple{}, ::Array{Any,1}, ::Type{T} where T, ::Void, ::Array{AutoGrad.Node,1}) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:307
 [4] AutoGrad.Rec(::Void) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:307
 [5] include_from_node1(::String) at ./loading.jl:539
 [6] include(::String) at ./sysimg.jl:14
 [7] include_from_node1(::String) at ./loading.jl:539
 [8] include(::String) at ./sysimg.jl:14
 [9] anonymous at ./<missing>:2
 [10] eval(::Module, ::Any) at ./boot.jl:235
 [11] process_options(::Base.JLOptions) at ./client.jl:286
 [12] _start() at ./client.jl:371
while loading /home/carlo/.julia/v0.6/AutoGrad/src/core.jl, in expression starting on line 329
WARNING: Array{T}(::Type{T}, m::Int) is deprecated, use Array{T}(m) instead.
Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:64
 [2] Array(::Type{AutoGrad.Node}, ::Int64) at ./deprecated.jl:51
 [3] Type at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:289 [inlined]
 [4] #Rec#10(::Function, ::Tuple{}, ::Array{Any,1}, ::Type{T} where T, ::Void, ::Array{AutoGrad.Node,1}) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:308
 [5] AutoGrad.Rec(::Void) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:307
 [6] include_from_node1(::String) at ./loading.jl:539
 [7] include(::String) at ./sysimg.jl:14
 [8] include_from_node1(::String) at ./loading.jl:539
 [9] include(::String) at ./sysimg.jl:14
 [10] anonymous at ./<missing>:2
 [11] eval(::Module, ::Any) at ./boot.jl:235
 [12] process_options(::Base.JLOptions) at ./client.jl:286
 [13] _start() at ./client.jl:371
while loading /home/carlo/.julia/v0.6/AutoGrad/src/core.jl, in expression starting on line 329
WARNING: Array{T}(::Type{T}, m::Int) is deprecated, use Array{T}(m) instead.
Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:64
 [2] Array(::Type{AutoGrad.Node}, ::Int64) at ./deprecated.jl:51
 [3] AutoGrad.Node(::AutoGrad.Rec{Void}) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:289
 [4] include_from_node1(::String) at ./loading.jl:539
 [5] include(::String) at ./sysimg.jl:14
 [6] include_from_node1(::String) at ./loading.jl:539
 [7] include(::String) at ./sysimg.jl:14
 [8] anonymous at ./<missing>:2
 [9] eval(::Module, ::Any) at ./boot.jl:235
 [10] process_options(::Base.JLOptions) at ./client.jl:286
 [11] _start() at ./client.jl:371
while loading /home/carlo/.julia/v0.6/AutoGrad/src/core.jl, in expression starting on line 329

WARNING: deprecated syntax "typealias D1 Type{Grad{1}}" at /home/carlo/.julia/v0.6/AutoGrad/src/util.jl:325.
Use "const D1 = Type{Grad{1}}" instead.

WARNING: deprecated syntax "typealias D2 Type{Grad{2}}" at /home/carlo/.julia/v0.6/AutoGrad/src/util.jl:326.
Use "const D2 = Type{Grad{2}}" instead.

WARNING: deprecated syntax "typealias Dn{N} Type{Grad{N}}" at /home/carlo/.julia/v0.6/AutoGrad/src/util.jl:328.
Use "Dn{N} = Type{Grad{N}}" instead.
ERROR: LoadError: LoadError: UndefVarError: decolon not defined
Stacktrace:
 [1] include_from_node1(::String) at ./loading.jl:539
 [2] include(::String) at ./sysimg.jl:14
 [3] include_from_node1(::String) at ./loading.jl:539
 [4] include(::String) at ./sysimg.jl:14
 [5] anonymous at ./<missing>:2
while loading /home/carlo/.julia/v0.6/AutoGrad/src/interfaces.jl, in expression starting on line 99
while loading /home/carlo/.julia/v0.6/AutoGrad/src/AutoGrad.jl, in expression starting on line 18
ERROR: Failed to precompile AutoGrad to /home/carlo/.julia/lib/v0.6/AutoGrad.ji.
Stacktrace:
 [1] compilecache(::String) at ./loading.jl:673
 [2] require(::Symbol) at ./loading.jl:431

view is not supported

on julia 0.7:

julia> using AutoGrad
julia> view(rand(2,2,3),2,2,1:2)
2-element view(::Array{Float64,3}, 2, 2, 1:2) with eltype Float64:
 0.26936877599285936
 0.5965847464427165 

julia> view(Rec(rand(2,2,3)),2,2,1:2)
ERROR: MethodError: no method matching view(::Rec{Array{Float64,3}}, ::Int64, ::Int64, ::UnitRange{Int64})
Closest candidates are:
  view(::AbstractArray, ::Any...) where N at subarray.jl:132
Stacktrace:
 [1] top-level scope at none:0

Once view is supported, it will also be easy to support selectdim.
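
A hedged workaround sketch until view is supported: getindex on a Rec is a recorded primitive, so a (copying) slice should be able to stand in for the view:

Rec(rand(2,2,3))[2,2,1:2]   # should return a Rec holding the copied slice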

High memory usage?

It's great to see Julia's collection of AD tools grow!

I tried to take the gradient of the function below, but it seemed to hang for several minutes before Julia crashed:

julia> rosenbrock(x) = sum(map((i, j) -> (1 - j)^2 + 100*(i - j^2)^2, x[2:end], x[1:end-1]))
rosenbrock (generic function with 1 method)

julia> g = AutoGrad.grad(rosenbrock)
(::gradfun) (generic function with 1 method)

julia> x = rand(100000);

julia> @time g(x) # just sits here for several minutes, then crashes

It seems to be because of the sheer amount of memory being used. I know that reverse-mode, in general, can be a memory hog, but Python's autograd library produces this gradient in about 4 seconds on my machine, so I'm suspecting there's a bug somewhere.
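
A hedged sketch, not a fix for the underlying issue: a vectorized formulation records a handful of array operations instead of ~100000 scalar ones, which keeps the tape (and memory) small:

rosenbrock_vec(x) = sum((1 .- x[1:end-1]).^2 .+ 100 .* (x[2:end] .- x[1:end-1].^2).^2)
g = AutoGrad.grad(rosenbrock_vec)
g(rand(100000))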

Issue with special functions

on julia 0.6 and autograd master

julia> using SpecialFunctions

julia> erf(1)
0.8427007929497149

julia> using AutoGrad

julia> grad(erf)(1)
ERROR: erf(1R,) has been moved to the package SpecialFunctions.jl.
Run Pkg.add("SpecialFunctions") to install SpecialFunctions on Julia v0.6 and later,
and then run `using SpecialFunctions`.
Stacktrace:
 [1] erf(::AutoGrad.Rec{Int64}) at ./deprecated.jl:1294
 [2] forward_pass(::Function, ::Tuple{Int64}, ::Array{Any,1}, ::Int64) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:88
 [3] (::AutoGrad.##gradfun#1#3{Base.#erf,Int64})(::Array{Any,1}, ::Function, ::Int64, ::Vararg{Int64,N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:39
 [4] (::AutoGrad.#gradfun#2)(::Int64, ::Vararg{Int64,N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:39
 [5] macro expansion at ./REPL.jl:97 [inlined]
 [6] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73

ambiguity error in x^n

when both x and n are integers:

julia> grad(x->x^3)(1)
ERROR: MethodError: ^(::AutoGrad.Rec{Int64}, ::Int64) is ambiguous. Candidates:
  ^(x, p::Integer) in Base at intfuncs.jl:199
  ^(x1::AutoGrad.Rec{##1045}, x2::##1046) where {##1045<:Number, ##1046<:Number} in AutoGrad
Possible fix, define
  ^(::AutoGrad.Rec{##1045<:Number}, ::Integer)
Stacktrace:
 [1] (::##13#14)(::AutoGrad.Rec{Int64}) at ./REPL[9]:1
 [2] forward_pass(::Function, ::Tuple{Int64}, ::Array{Any,1}, ::Int64) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:88
 [3] (::AutoGrad.##gradfun#1#3{##13#14,Int64})(::Array{Any,1}, ::Function, ::Int64, ::Vararg{Int64,N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:39
 [4] (::AutoGrad.#gradfun#2)(::Int64, ::Vararg{Int64,N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:39
 [5] macro expansion at /home/carlo/.julia/v0.6/Revise/src/Revise.jl:775 [inlined]
 [6] (::Revise.##17#18{Base.REPL.REPLBackend})() at ./event.jl:73
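
A hedged workaround sketch until the ambiguity is resolved: avoid the Integer-exponent method, e.g. with a floating-point exponent or explicit multiplication:

grad(x->x^3.0)(1)   # should give 3.0
grad(x->x*x*x)(1)   # should give 3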

grad doesn't work properly for ArrayFire (OpenCL background)

Code to reproduce with Julia version 1.0.1 (2018-09-29):

julia> using ArrayFire, AutoGrad
ArrayFire v3.5.1 (OpenCL, 64-bit Linux, build 0a675e8)
[0] BEIGNET: Intel(R) HD Graphics Skylake ULT GT2, 4096 MB

julia> afx=AFArray([1f0,2f0,3f0])
AFArray: 3-element Array{Float32,1}:
 1.0
 2.0
 3.0
julia> x=Param(afx)
P(AFArray{Float32,1}(3))

julia> y=@diff sum(abs2(x))
T(14.0)

julia> grad(y,x)
AFArray: 3-element Array{Float32,1}:
 2.0
 0.0
 0.0

julia> y=@diff sum(x)
T(6.0)

julia> grad(y,x)
AFArray: 3-element Array{Float32,1}:
 1.0
 0.0
 0.0

scalar-valued function error

tan(x) = begin
    y = exp(-2x)
    return (1.0 - y) ./ (1.0 + y)
end

Out: tan (generic function with 1 method)

using Knet
dtan = grad(tan)

Out: (::gradfun) (generic function with 1 method)

println(dtan(1))

Out: 0.419974341614026

println(dtan(Any[1,2]))

Out: grad requires a scalar-valued function, got [0.761594, 0.964028]
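
A hedged sketch: grad needs a scalar-valued function, and since each output element here depends only on the matching input element, summing the outputs recovers the elementwise derivatives:

dtan_each = grad(x -> sum(tan(x)))
println(dtan_each(Any[1, 2]))   # expected ≈ [0.419974, 0.0706508]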

Gradient type mismatch

The grad function returns an Array{Float64} gradient for an Array{Float32} input parameter when the differentiated function's output is Float64. Is this expected behaviour of grad? I think the gradient should have the same type as the input parameters, since we use those gradients for SGD updates.

The issue occurs when I upgraded AutoGrad to the latest commit: 6acb8d5

Problem:

julia> f(x) = sum(x.*0.2)
f (generic function with 1 method)

julia> gf = grad(f)
(::gradfun) (generic function with 1 method)

julia> gf(ones(Float32,5,5))
5×5 Array{Float64,2}:
 0.2  0.2  0.2  0.2  0.2
 0.2  0.2  0.2  0.2  0.2
 0.2  0.2  0.2  0.2  0.2
 0.2  0.2  0.2  0.2  0.2
 0.2  0.2  0.2  0.2  0.2

If we remove the 0.2 inside sum, there is no problem:

julia> f(x) = sum(x)
f (generic function with 1 method)

julia> gf = grad(f)
(::gradfun) (generic function with 1 method)

julia> gf(ones(Float32,5,5))
5×5 Array{Float32,2}:
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
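
A hedged workaround sketch: keeping the literal in Float32 keeps the forward value, and hence the gradient, in Float32:

f(x) = sum(x .* 0.2f0)
gf = grad(f)
gf(ones(Float32,5,5))   # expected eltype Float32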

AutoGrad slower in Julia5 compared to Julia4

The following is an example from the Knet test scripts. Julia 0.4 is >2x faster, both on the first run and on the second run of testall(). Does anybody with more insight into Julia have an idea why?

using AutoGrad

function testall()
    for f in (sum, prod, maximum, minimum, sumabs, sumabs2)
        g = grad(f)
        for t in (Float32, Float64)
            for n in (1,(1,1),2,(2,1),(1,2),(2,2))
                d = g(rand(t,n))
            end
        end
    end
end

@time testall()
@time testall()

incorrect gradient when indexing into a matrix of vectors

I'm using the latest master of AutoGrad.jl in Julia 0.6.3 on Windows, and it gives an unexpected gradient when indexing into a matrix of vectors:

julia> a = Matrix{Array{Float32}}(2, 1)
2×1 Array{Array{Float32,N} where N,2}:
 #undef
 #undef

julia> a[1] = [2,3,4]
3-element Array{Int64,1}:
 2
 3
 4

julia> a[2] = [4,5,6]
3-element Array{Int64,1}:
 4
 5
 6

julia> g = grad(x->x[1]' * x[2])
(::gradfun) (generic function with 1 method)

julia> g(a)
2×1 Array{Any,2}:
 Float32[4.0, 5.0, 6.0]
 Float32[2.0, 3.0, 4.0]

julia> g = grad(x->x[1, 1]' * x[2, 1])
(::gradfun) (generic function with 1 method)

julia> g(a)
2×1 Array{Any,2}:
 4.0
 2.0

The two functions are identical in my opinion, but the latter one gives only the gradient of the first element in each vector.
