
denizyuret / autograd.jl

169 stars · 20 watchers · 26 forks · 781 KB

Julia port of the Python autograd package.

License: Other

Languages: Julia 99.43%, Perl 0.57%
Topics: autograd, knet, automatic-differentiation, machine-learning, deep-learning, data-science, neural-networks

autograd.jl's People

Contributors

carlolucibello, davidssmith, denizyuret, ekinakyurek, emreyolcu, gunnarfarneback, juliatagbot, mdpradeep, ozanarkancan, rfourquet, staticfloat, tkelman, xiaodaigh, ylxdzsw


autograd.jl's Issues

float type error in ^2

On Julia 0.6, taking powers of a recorded Float32 array yields a Float64 array:

julia> a=Rec(rand(Float32,3))
S[3]113R

julia> AutoGrad.unbox(a)
3-element Array{Float32,1}:
 0.988303
 0.40873 
 0.371695

julia> AutoGrad.unbox(a.^2)
3-element Array{Float64,1}:
 0.976743
 0.16706 
 0.138157
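
For comparison, broadcasting the same power over an unrecorded Float32 array preserves the element type (plain Julia, no AutoGrad involved), so the promotion happens somewhere in the recorded code path:

julia> eltype(rand(Float32,3).^2)
Float32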

broadcast error for integer power

On master and julia 0.7

julia> grad(x->sum(x.^2))([1,2,3])
┌ Warning: broadcast will default to iterating over its arguments in the future. Wrap arguments of
│ type `x::Rec{Array{Int64,1}}` with `Ref(x)` to ensure they broadcast as "scalar" elements.
│   caller = ip:0x0
└ @ Core :-1
ERROR: DimensionMismatch("Cannot multiply two vectors")
Stacktrace:
 [1] *(::Array{Int64,1}, ::Array{Int64,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v0.7/LinearAlgebra/src/deprecated.jl:566
 [2] power_by_squaring(::Array{Int64,1}, ::Int64) at ./intfuncs.jl:192
 [3] ^(::Array{Int64,1}, ::Int64) at ./deprecated.jl:55
 [4] (::getfield(AutoGrad, Symbol("##rfun#7#9")){typeof(^)})(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Rec{Array{Int64,1}}, ::Vararg{Any,N} where N) at /home/carlo/.julia/dev/AutoGrad/src/core.jl:133
 [5] rfun at /home/carlo/.julia/dev/AutoGrad/src/core.jl:130 [inlined]
 [6] ^ at ./none:0 [inlined]
 [7] macro expansion at ./none:0 [inlined]
 [8] literal_pow at ./none:0 [inlined]
 [9] _broadcast_getindex_evalf at ./broadcast.jl:574 [inlined]
 [10] _broadcast_getindex at ./broadcast.jl:547 [inlined]
 [11] getindex at ./broadcast.jl:507 [inlined]
 [12] copy at ./broadcast.jl:734 [inlined]
 [13] materialize at ./broadcast.jl:724 [inlined]
 [14] (::getfield(Main, Symbol("##15#16")))(::Rec{Array{Int64,1}}) at ./REPL[11]:1
 [15] (::getfield(AutoGrad, Symbol("##gradfun#1#2")){getfield(Main, Symbol("##15#16")),Int64})(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{Int64,1}) at /home/carlo/.julia/dev/AutoGrad/src/core.jl:95
 [16] (::getfield(AutoGrad, Symbol("#gradfun#3")){getfield(AutoGrad, Symbol("##gradfun#1#2")){getfield(Main, Symbol("##15#16")),Int64}})(::Array{Int64,1}) at /home/carlo/.julia/dev/AutoGrad/src/core.jl:39
 [17] top-level scope at none:0

The temporary workaround would be to use a float exponent

julia> grad(x->sum(x.^2.0))([1,2,3])
3-element Array{Float64,1}:
 2.0
 4.0
 6.0

AutoGrad error in backprop when iterating dict

using AutoGrad, Knet

u = [rand(2,3), rand(2)]
v = [rand(1,2), rand(1)]
m = Dict(:u=>u, :v=>v)

x,y = rand(3,4),rand(1,4)

pred(m,x) = foldl((x,w)->w[1]*x .+ w[2], x, [m[:u],m[:v]])

loss(m,x,y) = mean(abs2, pred(m,x)-y)

∇ = grad(loss)

∇(m,x,y)  # OK

l2(ws) = mean(mean.(abs2, ws))

loss(m,x,y) = mean(abs2, pred(m,x)-y) + mean(l2.(collect(values(m))))

∇ = grad(loss)

loss(m,x,y)  # OK 
∇(m,x,y)  # Error

ERROR: MethodError: Cannot `convert` an object of type AutoGrad.Rec{Array{Array{Float64,N} where N,1}} to an object of type Array{Array{Float64,N} where N,1}
This may have arisen from a call to the constructor Array{Array{Float64,N} where N,1}(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] convert(::Type{Pair{Symbol,Array{Array{Float64,N} where N,1}}}, ::Pair{Symbol,AutoGrad.Rec{Array{Array{Float64,N} where N,1}}}) at ./pair.jl:35
 [2] copy!(::Array{Pair{Symbol,Array{Array{Float64,N} where N,1}},1}, ::AutoGrad.Rec{Dict{Symbol,Array{Array{Float64,N} where N,1}}}) at ./abstractarray.jl:575
 [3] loss(::AutoGrad.Rec{Dict{Symbol,Array{Array{Float64,N} where N,1}}}, ::Array{Float64,2}, ::Array{Float64,2}) at ./REPL[114]:1
 [4] forward_pass(::Function, ::Tuple{Dict{Symbol,Array{Array{Float64,N} where N,1}},Array{Float64,2},Array{Float64,2}}, ::Array{Any,1}, ::Int64) at /home/ngphuoc/.julia/v0.6/AutoGrad/src/core.jl:88
 [5] (::AutoGrad.##gradfun#1#3{#loss,Int64})(::Array{Any,1}, ::Function, ::Dict{Symbol,Array{Array{Float64,N} where N,1}}, ::Vararg{Any,N} where N) at /home/ngphuoc/.julia/v0.6/AutoGrad/src/core.jl:39
 [6] (::AutoGrad.#gradfun#2)(::Dict{Symbol,Array{Array{Float64,N} where N,1}}, ::Vararg{Any,N} where N) at /home/ngphuoc/.julia/v0.6/AutoGrad/src/core.jl:39

May be related to this closed issue denizyuret/Knet.jl#109
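
A hedged workaround sketch: compute the penalty from explicitly indexed entries instead of iterating values(m), so only getindex calls (which the working loss above already relies on) get recorded. The penalty below has the same form but is not numerically identical to the original l2 term:

reg(m) = mean(abs2, m[:u][1]) + mean(abs2, m[:u][2]) + mean(abs2, m[:v][1]) + mean(abs2, m[:v][2])
loss(m,x,y) = mean(abs2, pred(m,x)-y) + reg(m)/4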

Using AutoGrad with CuArray

I modified the housing example to use CuArray as shown below. The forward pass is OK, but the backward pass fails with ERROR: MethodError: no method matching next(::AutoGrad.Rec{CuArray{Float32,2}}, ::Tuple{Base.OneTo{Int64},Int64})

using Knet,CuArrays
include(Knet.dir("data","housing.jl"))
data = housing()
x,y = data
w = Any[ 0.1f0*cu(randn(Float32,1,13)), 0.0f0 ]

predict(w,x) = w[1]*x .+ w[2]

loss(w,x,y) = mean(abs2,y-predict(w,x))
loss(w,x,y)  # 593.6816f0

lossgradient = grad(loss)
lossgradient(w,x,y)  # Error: MethodError: no method matching next(::AutoGrad.Rec{CuArray{Float32,2}}, ::Tuple{Base.OneTo{Int64},Int64})

function train(w, data; lr=.1)
  for d=data
    x,y = cu.(d)
    dw = lossgradient(w, x, y)
    for i in 1:length(w)
      w[i] -= lr * dw[i]
    end
  end
  return w
end


for i=1:10; train(w, [data]); println(loss(w,x,y)); end

ambiguity error in size(rec, dims...)

The following code works well

julia> f(x)=(p=size(x); p[1]*sum(x.^2))

julia> grad(f)(ones(3))
3-element Array{Float64,1}:
 6.0
 6.0
 6.0

and size(x,1) also works without problems, but here I get an ambiguity error:

julia> f(x)=(p=size(x,(1,2)...); p[1]*sum(x.^2))

julia> grad(f)(ones(3,2,2,3))
ERROR: MethodError: size(::AutoGrad.Rec{Array{Float64,4}}, ::Int64, ::Int64, ::Int64) is ambiguous. Candidates:
  size{N}(x, d1::Integer, d2::Integer, dx::Vararg{Integer,N}) at abstractarray.jl:48
  size{##305}(x::AutoGrad.Rec{##305}, i...)
 in f(::AutoGrad.Rec{Array{Float64,4}}) at ./REPL[7]:1
 in forward_pass(::Function, ::Tuple{Array{Float64,4}}, ::Array{Any,1}, ::Int64) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:88
 in (::AutoGrad.##gradfun#1#3{#f,Int64})(::Array{Any,1}, ::Function, ::Array{Float64,4}, ::Vararg{Array{Float64,4},N}) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:39
 in (::AutoGrad.#gradfun#2)(::Array{Float64,4}, ::Vararg{Array{Float64,4},N}) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:39
 in eval_user_input(::Any, ::Base.REPL.REPLBackend) at ./REPL.jl:64
 in macro expansion at ./REPL.jl:95 [inlined]
 in (::Base.REPL.##3#4{Base.REPL.REPLBackend})() at ./event.jl:68

This could easily be fixed by defining

size(x::AutoGrad.Rec, d1::Integer, d2::Integer, dx::Vararg{Integer}) = size(getval(x), d1, d2, dx...)

as I already tested. If someone points me to the right location where this function should reside I can file a PR.

crossref denizyuret/Knet.jl#139

gradient type bug with special functions

Here is the problem:

julia> x=rand(Float32, 3);

julia> loss(x)=sum(erf.(x))
loss (generic function with 1 method)


julia> grad(loss)(x)
3-element Array{Float64,1}:     # Float64 instead of Float32 !!! 
 1.05727 
 0.610081
 1.08713 

The same happens for erfc.

Bye
C

Stochastic Computation Graphs

Hi all !

First, thanks for your awesome package!
Do you plan to handle gradients in stochastic computation graphs, i.e. graphs with conditional probability distributions, such as

using Distributions
w = ones(5); x = rand(5);
p = 1 / (1 + exp(-vecdot(w, x)))
y = rand(Bernoulli(p))
loss = (y == 1)

In Schulman, J., Heess, N., Weber, T., & Abbeel, P., Gradient Estimation Using Stochastic Computation Graphs, it is described how to convert a stochastic computation graph into a deterministic one, so that backpropagation applied to a surrogate loss function yields an unbiased estimator of the gradient of the stochastic loss.

Do you have an idea how (and how much work would be required) this could be implemented with(in) your package?

Best,
Emile
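
As a sketch of what such support would automate, here is a hand-rolled score-function (REINFORCE) surrogate in the spirit of Schulman et al. It relies only on ordinary AutoGrad primitives (exp, log, .*, sum); getval is assumed to unbox a Rec, as it is used elsewhere in these issues:

using AutoGrad
sigmoid(z) = 1 / (1 + exp(-z))

function surrogate(w, x)
    p0 = sigmoid(sum(AutoGrad.getval(w) .* x))  # sample using the unrecorded value
    y  = rand() < p0 ? 1 : 0                    # stochastic node, kept off the tape
    L  = float(y == 1)                          # stochastic loss, treated as a constant
    p  = sigmoid(sum(w .* x))                   # recorded probability
    return L * (y == 1 ? log(p) : log(1 - p))   # surrogate = L * log p(y|w)
end

w = ones(5); x = rand(5)
g = grad(w -> surrogate(w, x))(w)               # single-sample unbiased estimate of ∇E[loss]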

Feature request: add support for argmax

I got the following error when using argmax:

MethodError: no method matching argmax(::AutoGrad.Rec{Array{Float32,2}})

I'll use a .== maximum(a,dim) for now. Thanks.
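
A hedged workaround sketch: argmax only produces integer indices, which carry no gradient anyway, so it can be computed on the unboxed value and the result used to index the recorded array (getval is assumed to be the unboxing function, as used elsewhere in these issues):

f(x) = x[argmax(AutoGrad.getval(x))]   # gradient flows through the indexing, not through argmax
grad(f)(rand(Float32, 4))              # expected: 1 at the position of the maximum, 0 elsewhere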

x.^2 compared with x .* x

I tried two implementation of MSE:

MSE(x, x̂) = mean(sum((x - x̂) .* (x - x̂), 1))

and

MSE(x, x̂) = mean(sum((x - x̂).^2, 1))

where the latter can give me NaN in the same program while the first one works fine. Any idea why this happens?

grad of `convert`?

I need to convert the type of some variables, but it crashes AutoGrad.

The code I use is:

function loss(w,x,ygold)
    ypred = predict(w,x)
    ynorm = ypred .- log(sum(exp(ypred),1))
    convert(Float32, -sum(ygold .* ynorm) / size(ygold, 2))
end

I tried @primitive, like @primitive convert(T,x),dy zerograd() dy or something similar, but never got it to work. Is there any way to define it properly, or to hack AutoGrad so that at least convert works without breaking other functionality?
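
A hedged sketch (the helper name cvt32 is mine, not part of AutoGrad): wrapping convert in a one-argument function avoids having to declare a gradient for the type parameter, and the incoming gradient can simply be passed back for the value argument, following the @primitive pattern used elsewhere in these issues:

using AutoGrad
cvt32(x) = convert(Float32, x)
@primitive cvt32(x),dy,y dy   # strictly, one may want oftype(x, dy) here
# then in the loss:
#   cvt32(-sum(ygold .* ynorm) / size(ygold, 2))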

Autograd usage problem

I want to compute the gradient at [1,2]. I wrote the following code to do this, but it gives an error. Thanks in advance for your help.

using Knet
J(w) = w[1]^2 .+ w[2]^2;
dJ = grad(J);
w = [1,2];
println(J(w))
println(dJ(w))

Output:
5
MethodError: ^(::AutoGrad.Rec{Int64}, ::Int64) is ambiguous. Candidates:
^(x, p::Integer) in Base at intfuncs.jl:199
^(x1::AutoGrad.Rec{##1045}, x2::##1046) where {##1045<:Number, ##1046<:Number} in AutoGrad
Possible fix, define
^(::AutoGrad.Rec{##1045<:Number}, ::Integer)
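
A hedged workaround sketch: the ambiguity comes from ^ with an Integer exponent on a recorded scalar, so writing the squares as products (or using floating-point exponents) sidesteps it:

using Knet
J(w) = w[1]*w[1] + w[2]*w[2]
dJ = grad(J)
println(dJ([1.0, 2.0]))   # expected [2.0, 4.0]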

bug second derivative tanh

julia> g1 = grad(tanh)
(::gradfun) (generic function with 1 method)

julia> grad(tanh)(1.) == 1-tanh(1.)^2
true

julia> grad(g1)(2.) #BUG: returns nothing
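
For reference, a plain finite-difference check (no AutoGrad internals assumed) shows what the second derivative should be:

h = 1e-6
d2 = (grad(tanh)(2.0 + h) - grad(tanh)(2.0 - h)) / 2h   # analytic value is -2*tanh(2)*(1 - tanh(2)^2) ≈ -0.136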

1.0.0 Todo list

  • broadcast of user defined functions not supported: #101
  • Solve outstanding bugs and issues.
  • Review and merge pull requests. #54 #57
  • Unit testing and more gradients in base.jl.
  • Unit testing for cat.jl.
  • Unit testing for iterate.jl.
  • Unit testing for linearalgebra.jl.
  • Compare all test files with src files to check for completeness.
  • Activate codecov.
  • Add missing derivatives, check out DiffRules.jl. #51
  • Overriding broadcasted vs broadcast? Measure memory and speed. Figure out Knet functions vs AutoGrad functions. What methods are defined?
  • scan code, finish todos and optimize
  • minimize function creation: go from f(Grad{n}) to back(f,n,...), recorder(f)(x) to forw(f,x...)?
  • speed up tests by reducing compilation during gradcheck.
  • try memoization on tape
  • fix scripts under prof/ and speed test.
  • test highorder: Innes has a PR?
  • fix docs and comments and examples
  • optimize sum_outgrads, reduce memory use through memoization and more UngetIndex.
  • ::Rec .^ ::Int does not work! #80
  • tracked array interface
  • Fix the documentation so core.jl documentation can be seen by Docutils.
  • Transfer to KnetML.
  • Clear outgrad after for loop in backward_pass to save memory.
  • Missing / broken linalg gradients.
  • Figure out a better way to specify test ranges for functions.

Coding practices

@CarloLucibello I am responding to your comments here to make the discussion easier:

can we avoid exporting 2 letters names to avoid conflicts and improve code readability?

I need these 2 letter names to avoid carpal tunnel when testing :) They are not documented nor are they used in final code. Originally I had put them in a subpackage so you would need to explicitly import AutoGrad.Abbrev to use them, so I can do that again.

I'm commenting here because as usual (and quite annoyingly) commits get pushed straight to master instead of using a separate branch with a corresponding PR.

I am sorry about the sloppy practice. In the recent rewrite of the core engine, I worked on a branch called corehack for about a week, then when all tests passed I approved myself and pushed things into master. What would you recommend? How should I do it differently?

Since now Rec has been renamed to Value, we shouldn't be using value for something which is not returning a Value type. In fact, per julia conventions, something like string(x) converts x to a String. Also, what was wrong with the old names?

This was for code readability. isa(a,Rec) is not readable, isa(a,Value), isa(a,Param), isa(a,Result) are all readable. getval(x) reads like an abbreviation, value(x) is more readable. In the past we had only one type of recorded/tracked object. In the new design we have user declared ones (Param), and intermediate/final results (Result), where both types cause tracking, so they have a supertype (Value).

Having said that, the only part of the code exposed to the API is Param and value. Old names still work and give a deprecation warning.

I am not 100% happy with too many values and Values going around either, but that is now just an internal problem to AutoGrad, not to the end user. However I am open to suggestions.

@primitive and broadcast

It is not clear how to define custom derivatives and have them work in conjunction with broadcast.
Is it possible to extend the @primitive macro to handle broadcasts automatically?

Here is a workaround I'm using

using AutoGrad
import AutoGrad: Broadcasted, getval  # getval is used below to unbox Rec arguments

fun(x) = x^2
grd(x) = 0 # fake gradient

function broadcast_func(f)
    f = Symbol(lstrip(string(f), '.'))
    bf = Symbol("broadcast#", f)
    if !isdefined(bf)
        @eval begin
            # We need this when x is of a regular type (@primitive only defines bf for Rec)
            $bf(x...) = broadcast($f, x...)
            $f(x::Broadcasted...) = $bf(getval.(x)...) |> Broadcasted
            # We need the following because sometimes the interpreter does not convert all args to Broadcasted:
            $f(x1::Broadcasted, x2) = $bf(getval(x1), x2) |> Broadcasted
            $f(x1, x2::Broadcasted) = $bf(x1, getval(x2)) |> Broadcasted
        end
    end
    bf
end

bf = broadcast_func(fun)
@primitive fun(x),dy,y  (@. dy*grd(x))
if bf != fun
    @eval @primitive $bf(x),dy,y  (@. dy*grd(x))
end

grad(x->sum(fun.(x)))([1.])
#1-element Array{Float64,1}:
# 0.0

objects of type AutoGrad.Result{KnetArray{Float32,2}} are not callable

I am getting the following error while implementing a Knet model.

ERROR: LoadError: MethodError: objects of type AutoGrad.Result{KnetArray{Float32,2}} are not callable
Stacktrace:
 [1] #differentiate#3(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Param{Array{KnetArray{Float32,N} where N,1}}, ::Vararg{Any,N} where N) at C:\Users\user\.julia\packages\AutoGrad\Vt8aS\src\core.jl:56
 [2] differentiate(::Function, ::Param{Array{KnetArray{Float32,N} where N,1}}, ::Vararg{Any,N} where N) at C:\Users\user\.julia\packages\AutoGrad\Vt8aS\src\core.jl:43
 [3] (::getfield(AutoGrad, Symbol("##gradfun#6#7")){typeof(loss),Int64,Bool})(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{KnetArray{Float32,N} where N,1}, ::Vararg{Any,N} where N) at C:\Users\user\.julia\packages\AutoGrad\Vt8aS\src\core.jl:127
 [4] (::getfield(AutoGrad, Symbol("#gradfun#8")){getfield(AutoGrad, Symbol("##gradfun#6#7")){typeof(loss),Int64,Bool}})(::Array{KnetArray{Float32,N} where N,1}, ::Vararg{Any,N} where N) at C:\Users\user\.julia\packages\AutoGrad\Vt8aS\src\core.jl:123
 [5] #epoch!#26(::Int64, ::Function, ::Array{KnetArray{Float32,N} where N,1}, ::Array{Any,1}, ::Array{Momentum,1}, ::Array{Float32,4}, ::Array{Any,1}) at C:\Users\user\Documents\julia projects\testme.jl:258
 [6] epoch!(::Array{KnetArray{Float32,N} where N,1}, ::Array{Any,1}, ::Array{Momentum,1}, ::Array{Float32,4}, ::Array{Any,1}) at C:\Users\user\Documents\julia projects\testme.jl:254
 [7] #train#30(::Type, ::Int64, ::Float64, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function) at C:\Users\user\Documents\julia projects\testme.jl:287
 [8] train() at C:\Users\user\Documents\julia projects\testme.jl:277
 [9] top-level scope at none:0
 [10] include_string(::Module, ::String, ::String) at .\loading.jl:1002
 [11] (::getfield(Atom, Symbol("##118#123")){String,String,Module})() at C:\Users\user\.julia\packages\Atom\WSz3k\src\eval.jl:120
 [12] withpath(::getfield(Atom, Symbol("##118#123")){String,String,Module}, ::String) at C:\Users\user\.julia\packages\CodeTools\8CjYJ\src\utils.jl:30
 [13] withpath at C:\Users\user\.julia\packages\Atom\WSz3k\src\eval.jl:46 [inlined]
 [14] #117 at C:\Users\user\.julia\packages\Atom\WSz3k\src\eval.jl:117 [inlined]
 [15] hideprompt(::getfield(Atom, Symbol("##117#122")){String,String,Module}) at C:\Users\user\.julia\packages\Atom\WSz3k\src\repl.jl:76
 [16] macro expansion at C:\Users\user\.julia\packages\Atom\WSz3k\src\eval.jl:116 [inlined]
 [17] (::getfield(Atom, Symbol("##116#121")){Dict{String,Any}})() at .\task.jl:85
in expression starting at C:\Users\user\Documents\julia projects\testme.jl:292

wishlist for next version

  • Julia 0.6 compatibility, fix warnings, fix broadcast.
  • Fix the test structure to be compatible with new BaseTest.
  • Try TakingBroadcastsSeriously.jl for broadcast ops. See JuliaLang/julia/issues/22060.
  • Consolidate gradcheck and check_grads into a single mechanism, rethink unit testing with addtest.
  • Remove eval from code, possibly responsible for post-Julia4 slowdown.
  • Fix the documentation so core.jl documentation can be seen by Docutils.
  • Provide a more flexible interface where the user decides what is boxed: the same method can also control what is tested in gradcheck.
  • Either test higher-order gradients thoroughly or remove them and simplify code.
  • Change citation in README to Knet paper.

`@dbg` macro never prints error messages for me (julia v0.5.2)

The @dbg lines don't seem to print anything for me (Version 0.5.2 (2017-05-06 16:34 UTC))

julia> AutoGrad.@dbg 1 3+3

julia> AutoGrad.@dbg 1 "Error msg."

julia> AutoGrad.@dbg 0 "Error msg."

julia> AutoGrad.@dbg -1 "Error msg."

julia>

I noticed this because I was perplexed that I didn't see an error message when I first tried the following:

julia> foo(X) = [x*x for x in X]
foo (generic function with 1 method)

julia> grad(foo)([1,2])

julia>

I now understand of course that it's because I'm trying to take the gradient of a non-scalar-value-returning function. But I was sad that I didn't see an error message.

But then I started looking through the code, and saw all these @dbg messages, but those don't seem to ever print for me. :(

julia> using AutoGrad

julia> f(x) = 3
f (generic function with 1 method)

julia> grad(f)(2)

julia>

I think that should've triggered this line?

Type of AutoGrad.Rec

Is it possible to make AutoGrad.Rec a subtype of Real so that this package can work with Distributions.jl?

Dispatching on variables w.r.t. which gradients are taken

Hi,

Firstly, I've found the package really helpful, so thanks for porting it over from Python.

It appears to be the case that one cannot do the following:

using AutoGrad
f(x::Float64) = x^2
df = grad(f)
df(5.0)

I obtain the error ERROR: MethodError: no method matching f(::AutoGrad.Rec{Float64}). If I have interpreted this correctly, it would appear that one cannot dispatch on the type of the arguments with respect to which we are taking gradients, without making the function a primitive and defining the appropriate computations involving the Jacobian. Is there a simple way to resolve this?

Thanks,
Will
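
A hedged workaround sketch: grad(f) calls f with an AutoGrad.Rec, which does not match x::Float64, so either drop the annotation or add a forwarding method for the recorded type (x*x is used instead of x^2 only to sidestep the separate integer-power ambiguity reported in other issues here):

using AutoGrad
f(x::Float64) = x*x
f(x::AutoGrad.Rec{Float64}) = x*x   # same body; the operations inside are still recorded
df = grad(f)
df(5.0)                             # expected 10.0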

missing mean(rec, 1)

The following two examples work:

julia> a=ones(2,2,2)

julia> f(a) =sum(a.^2, 1)[1]

julia> grad(f)(a)
2×2×2 Array{Float64,3}:
[:, :, 1] =
 2.0  0.0
 2.0  0.0

[:, :, 2] =
 0.0  0.0
 0.0  0.0

julia> f(a) =mean(a.^2)

julia> grad(f)(a)
2×2×2 Array{Float64,3}:
[:, :, 1] =
 0.25  0.25
 0.25  0.25

[:, :, 2] =
 0.25  0.25
 0.25  0.25

But this one does not:

julia> f(a) =mean(a.^2,1)[1]
WARNING: Method definition f(Any) in module Main at REPL[39]:1 overwritten at REPL[41]:1.
f (generic function with 1 method)

julia> grad(f)(a)
ERROR: MethodError: no method matching mean(::AutoGrad.Rec{Array{Float64,3}}, ::Int64)
Closest candidates are:
  mean(::Union{DataType,Function}, ::Any) at statistics.jl:11
  mean{T}(::AbstractArray{T,N}, ::Any) at statistics.jl:49
  mean(::Knet.KnetArray{T,N}, ::Any) at /home/carlo/.julia/v0.5/Knet/src/reduction.jl:153
  ...
 in f(::AutoGrad.Rec{Array{Float64,3}}) at ./REPL[41]:1
 in forward_pass(::Function, ::Tuple{Array{Float64,3}}, ::Array{Any,1}, ::Int64) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:88
 in (::AutoGrad.##gradfun#1#3{#f,Int64})(::Array{Any,1}, ::Function, ::Array{Float64,3}, ::Vararg{Array{Float64,3},N}) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:39
 in (::AutoGrad.#gradfun#2)(::Array{Float64,3}, ::Vararg{Array{Float64,3},N}) at /home/carlo/.julia/v0.5/AutoGrad/src/core.jl:39
 in eval_user_input(::Any, ::Base.REPL.REPLBackend) at ./REPL.jl:64
 in macro expansion at ./REPL.jl:95 [inlined]
 in (::Base.REPL.##3#4{Base.REPL.REPLBackend})() at ./event.jl:68

Where is the sum function implemented, so that I can fix the issue for mean taking sum as an example?
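
An untested sketch: since sum(::Rec, dim) already works (as the first example shows) and size(::Rec, dim) is reported to work in other issues here, mean over a single dimension can be expressed through them without defining a new primitive:

import Base: mean
mean(x::AutoGrad.Rec, d::Integer) = sum(x, d) / size(x, d)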

gradcheck fails for broadcast operation

I came across an interesting bug in gradcheck. When w is initialized with zeros, gradcheck fails. Otherwise it gives the correct output.

julia> using Knet
julia> w = KnetArray(zeros(1,1024));
julia> x = KnetArray(randn(14,1024));
julia> broadcasting(w,x) =  mean(x .+ w);
julia> gradcheck(broadcasting,w,x;verbose=true)

WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
WARNING: d=0.0009765625 nd=0.0
false

julia> w = KnetArray(randn(1,1024));
julia> gradcheck(broadcasting,w,x;verbose=true)
gcheck: d=0.0009765625 nd=0.0009765625004297092
gcheck: d=0.0009765625 nd=0.0009765624997314316
gcheck: d=0.0009765625 nd=0.0009765625004655185
gcheck: d=0.0009765625 nd=0.0009765624997135271
gcheck: d=0.0009765625 nd=0.000976562500447614
gcheck: d=0.0009765625 nd=0.000976562500165617
gcheck: d=0.0009765625 nd=0.0009765624997135271
gcheck: d=0.0009765625 nd=0.0009765624997135271
gcheck: d=0.0009765625 nd=0.0009765625004526496
gcheck: d=0.0009765625 nd=0.0009765624997493363
true

ERROR: Out of gpu memory

Hello,
First of all thanks for the nice package. I'm using it together with Knet and GPUArrays.
I'm running simulations on a deep network (VGG16) on the GPU. I noticed that if I use minibatches above a certain size (for me, >30) the following error is thrown:

ERROR: Out of gpu memory
 in Knet.KnetPtr(::Int64) at /home/enzo/.julia/v0.5/Knet/src/kptr.jl:96
 in relu(::Knet.KnetArray{Float32,4}) at /home/enzo/.julia/v0.5/Knet/src/unary.jl:85
 in (::AutoGrad.##rfun#4#6{Knet.#relu})(::Array{Any,1}, ::Function, ::AutoGrad.Rec{Knet.KnetArray{Float32,4}}, ::Vararg{AutoGrad.Rec{Knet.KnetArray{Float32,4}},N}) at /home/enzo/.julia/v0.5/AutoGrad/src/core.jl:110
 in relu(::AutoGrad.Rec{Knet.KnetArray{Float32,4}}) at ./<missing>:0
 in convnet(::AutoGrad.Rec{Array{Any,1}}, ::Knet.KnetArray{Float32,4}) at /home/enzo/work/hebbianrule/vgg_train.jl:301
 in loss(::AutoGrad.Rec{Array{Any,1}}, ::Knet.KnetArray{Float32,4}, ::Knet.KnetArray{Float32,2}) at /home/enzo/work/hebbianrule/vgg_train.jl:340
 in forward_pass(::Function, ::Tuple{Array{Any,1},Knet.KnetArray{Float32,4},Knet.KnetArray{Float32,2}}, ::Array{Any,1}, ::Int64) at /home/enzo/.julia/v0.5/AutoGrad/src/core.jl:75
 in (::AutoGrad.##gradfun#1#3{VGG.#loss,Int64})(::Array{Any,1}, ::Function, ::Array{Any,1}, ::Vararg{Any,N}) at /home/enzo/.julia/v0.5/AutoGrad/src/core.jl:47
 in (::AutoGrad.#gradfun#2)(::Array{Any,1}, ::Vararg{Any,N}) at /home/enzo/.julia/v0.5/AutoGrad/src/core.jl:47
 in #train!#3(::Int64, ::Int64, ::Float64, ::Float64, ::Float64, ::Float64, ::Function, ::Array{Any,1}, ::Array{Any,1}, ::Array{Float32,3}) at /home/enzo/work/hebbianrule/vgg_train.jl:106
 in (::VGG.#kw##train!)(::Array{Any,1}, ::VGG.#train!, ::Array{Any,1}, ::Array{Any,1}, ::Array{Float32,3}) at ./<missing>:0
 in main(::String) at /home/enzo/work/hebbianrule/vgg_train.jl:80

which apparently is due the way AutoGrad handles the memory. Is there a way to avoid this memory issue?

mixed Rec and KnetArray problem in vcat

When a matrix has mixed Rec and KnetArray elements, AutoGrad's vcat only looks at the first 2-3 elements. Here is a temporary solution:

using AutoGrad
let cat_r = recorder(cat); global vcatn
    function vcatn(a...)
        if any(x->isa(x,Rec), a)
            cat_r(1,a...)
        else
            vcat(a...)
        end
    end
end

It needs to be put into AutoGrad.
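
The same pattern presumably extends to horizontal concatenation (an untested sketch along the same lines):

let cat_r = recorder(cat); global hcatn
    function hcatn(a...)
        if any(x->isa(x,Rec), a)
            cat_r(2,a...)
        else
            hcat(a...)
        end
    end
end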

non-scalar loss function

I try to circumvent the limitation of a scalar loss function by (deep) copying the recorded tape at the right moment, e.g.

import AutoGrad: Tape, Rec, backward_pass
tape = Tape()
w = rand(2, 4)
w = Rec(w, tape)
x = rand(4)
y = w * x # or a deep neural net
y2 = deepcopy(y)
endbox1 = (y[1] - 1)^2
endbox2 = (y2[2] - 3)^4
g1 = backward_pass(w, endbox1, endbox1.tapes[1])
g2 = backward_pass(w, endbox2, endbox2.tapes[1])
g = 0.1 * g1 + g2

Is there a more elegant way to achieve this?
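
A hedged alternative sketch: when only the weighted combination of the two endpoints is needed, the same gradient follows by linearity from a single scalar loss and the public grad interface:

using AutoGrad
x = rand(4)
loss(w) = (y = w * x; 0.1*(y[1] - 1)^2 + (y[2] - 3)^4)
g = grad(loss)(rand(2, 4))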

bug with sum of number

Hi, I found this strange bug:

julia> a=rand(3)
3-element Array{Float64,1}:
 0.35653  
 0.0676971
 0.12812  

julia> grad(x->sum(x'x))(a)
ERROR: InexactError()
Stacktrace:
 [1] ones at ./array.jl:263 [inlined]
 [2] ones at ./array.jl:264 [inlined]
 [3] ones at ./array.jl:266 [inlined]
 [4] ones at ./<missing>:0 [inlined]
 [5] sum(::Type{AutoGrad.Grad{1}}, ::Float64, ::Float64, ::AutoGrad.Rec{Float64}) at ./<missing>:0
 [6] backward_pass(::AutoGrad.Rec{Array{Float64,1}}, ::AutoGrad.Rec{Float64}, ::Array{AutoGrad.Node,1}) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:256
 [7] (::AutoGrad.##gradfun#1#3{##5#6,Int64})(::Array{Any,1}, ::Function, ::Array{Float64,1}, ::Vararg{Array{Float64,1},N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:40
 [8] (::AutoGrad.#gradfun#2)(::Array{Float64,1}, ::Vararg{Array{Float64,1},N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:39
 [9] macro expansion at /home/carlo/.julia/v0.6/Revise/src/Revise.jl:775 [inlined]
 [10] (::Revise.##17#18{Base.REPL.REPLBackend})() at ./event.jl:73

It is strange because all of the following work (although there are some inconsistencies):

julia> grad(x->x'x)(a)
3-element Array{Float64,1}:
 0.71306 
 0.135394
 0.25624 

julia> grad(x->x'x)(1)
2

julia> grad(x->sum(x'x))(1)
1-element Array{Float64,1}:
 2.0

julia> grad(x->sum(x'))(1)
1×1 RowVector{Float64,Array{Float64,1}}:
 1.0

julia> grad(x->sum(x'))(a)
3-element Array{Float64,1}:
 1.0
 1.0
 1.0

static computation

Hi,
thanks for this nice package (and for Knet as well).
How difficult would it be to support static computation, at least for a limited set of operations? Here is a comparison with ReverseDiff.jl, where AutoGrad lags two orders of magnitude behind:

julia> f(x) = sum(x->x^2,x)
f (generic function with 1 method)

julia> v=rand(100);

julia> @benchmark grad(f)(v)
BenchmarkTools.Trial: 
  memory estimate:  411.38 KiB
  allocs estimate:  9398
  --------------
  minimum time:     1.068 ms (0.00% GC)
  median time:      1.088 ms (0.00% GC)
  mean time:        1.182 ms (6.49% GC)
  maximum time:     5.658 ms (78.79% GC)
  --------------
  samples:          4204
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> df! = ReverseDiff.compile_gradient(f,v)
(::#301) (generic function with 1 method)

julia> y=ones(v);

julia> @benchmark df!(y,v)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     11.353 μs (0.00% GC)
  median time:      11.426 μs (0.00% GC)
  mean time:        11.636 μs (0.00% GC)
  maximum time:     35.284 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

I encounter the same 100x slowdown if I increase the size to v=rand(1000)

Cheers,
Carlo

gradient of std(Array{Float32,N}) throws error

For example, when adding the following lines to test/statistics.jl, 6 of these tests fail:

@test gradcheck(mean, randn(Float32,2,3))
@test gradcheck(mean, randn(Float32,2,3), kwargs=[:dims=>1])
@test gradcheck(mean, randn(Float32,2,3), kwargs=[:dims=>(1,2)])
@test gradcheck(meanabs, randn(Float32,2,3))
@test gradcheck(meanabs2, randn(Float32,2,3))
@test gradcheck(var, randn(Float32,2,3))
@test gradcheck(var, randn(Float32,2,3), kwargs=[:dims=>1])
@test gradcheck(var, randn(Float32,2,3), kwargs=[:dims=>(1,2)])
@test gradcheck(std, randn(Float32,2,3))
@test gradcheck(std, randn(Float32,2,3), kwargs=[:dims=>1])
@test gradcheck(std, randn(Float32,2,3), kwargs=[:dims=>(1,2)])

It seems that the eltype of the input data is not taken into account when allocating the output:

  Expression: gradcheck(var, randn(Float32, 2, 3), kwargs=[:dims => (1, 2)])
  MethodError: no method matching sum_outgrads(::Array{Float32,2}, ::Array{Float64,2})
  Closest candidates are:
    sum_outgrads(!Matched::Nothing, ::Any) at /home/rene/.julia/dev/AutoGrad/src/core.jl:499
    sum_outgrads(::AbstractArray{T,N} where N, !Matched::AbstractArray{T,N} where N) where T at /home/rene/.julia/dev/AutoGrad/src/core.jl:486
    sum_outgrads(!Matched::Rec, ::Any) at /home/rene/.julia/dev/AutoGrad/src/core.jl:490
    ...
  Stacktrace:
   [1] backward_pass(::Rec{Array{Float32,2}}, ::Rec{Float32}, ::Array{AutoGrad.Node,1}) at /home/rene/.julia/dev/AutoGrad/src/core.jl:252
   [2] (::getfield(AutoGrad, Symbol("##gradfun#1#2")){getfield(Main, Symbol("#g#54")){getfield(Main, Symbol("##g#52#53")){typeof(var)}},Int64})(::Base.Iterators.Pairs{Symbol,Tuple{Int64,Int64},Tuple{Symbol},NamedTuple{(:dims,),Tuple{Tuple{Int64,Int64}}}}, ::Function, ::
Array{Float32,2}) at /home/rene/.julia/dev/AutoGrad/src/core.jl:41

Performance issue

Hi!
I am starting to use AutoGrad and I have a question concerning its performance compared to ReverseDiff.jl.

I have this basic setup:

using ReverseDiff, AutoGrad, BenchmarkTools

function f(x)
    m = length(x)
    return 100.0 * sum((x[i] - x[i - 1]^2)^2 for i=2:m) + (1.0 - x[1])^2
end

n = 2
x = [0.150369, 0.8463333]
u = [0.284309, 0.927797]

With ReverseDiff I do:

g = Array{Any}(n)
tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, rand(n)))
F = x -> ReverseDiff.gradient!(g, tape, x)
@benchmark F(x) setup=(x=rand(2))

Which leads to:

BenchmarkTools.Trial: 
  memory estimate:  32 bytes
  allocs estimate:  2
  --------------
  minimum time:     527.174 ns (0.00% GC)
  median time:      539.411 ns (0.00% GC)
  mean time:        547.806 ns (0.19% GC)
  maximum time:     6.466 μs (88.03% GC)
  --------------
  samples:          10000
  evals/sample:     190

And with AutoGrad I do:

gradg = AutoGrad.grad(f, 1)
@benchmark gradg(x) setup=(x=rand(2))

Which leads to:

BenchmarkTools.Trial: 
  memory estimate:  11.31 KiB
  allocs estimate:  299
  --------------
  minimum time:     37.077 μs (0.00% GC)
  median time:      38.893 μs (0.00% GC)
  mean time:        41.757 μs (3.26% GC)
  maximum time:     2.893 ms (95.67% GC)
  --------------
  samples:          10000
  evals/sample:     1

So AutoGrad is much slower than ReverseDiff. I assume it's because I can precompile a tape with ReverseDiff, which makes it faster.

Is it possible to get a similar level of performance using AutoGrad?
Thanks!

should be updated for julia 0.6

currently it fails to precompile on 0.6

julia> using AutoGrad
INFO: Recompiling stale cache file /home/carlo/.julia/lib/v0.6/AutoGrad.ji for module AutoGrad.

WARNING: deprecated syntax "typealias Tape Vector{Node}" at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:294.
Use "const Tape = Vector{Node}" instead.
WARNING: Array{T}(::Type{T}, m::Int) is deprecated, use Array{T}(m) instead.
Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:64
 [2] Array(::Type{AutoGrad.Node}, ::Int64) at ./deprecated.jl:51
 [3] #Rec#10(::Function, ::Tuple{}, ::Array{Any,1}, ::Type{T} where T, ::Void, ::Array{AutoGrad.Node,1}) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:307
 [4] AutoGrad.Rec(::Void) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:307
 [5] include_from_node1(::String) at ./loading.jl:539
 [6] include(::String) at ./sysimg.jl:14
 [7] include_from_node1(::String) at ./loading.jl:539
 [8] include(::String) at ./sysimg.jl:14
 [9] anonymous at ./<missing>:2
 [10] eval(::Module, ::Any) at ./boot.jl:235
 [11] process_options(::Base.JLOptions) at ./client.jl:286
 [12] _start() at ./client.jl:371
while loading /home/carlo/.julia/v0.6/AutoGrad/src/core.jl, in expression starting on line 329
WARNING: Array{T}(::Type{T}, m::Int) is deprecated, use Array{T}(m) instead.
Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:64
 [2] Array(::Type{AutoGrad.Node}, ::Int64) at ./deprecated.jl:51
 [3] Type at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:289 [inlined]
 [4] #Rec#10(::Function, ::Tuple{}, ::Array{Any,1}, ::Type{T} where T, ::Void, ::Array{AutoGrad.Node,1}) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:308
 [5] AutoGrad.Rec(::Void) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:307
 [6] include_from_node1(::String) at ./loading.jl:539
 [7] include(::String) at ./sysimg.jl:14
 [8] include_from_node1(::String) at ./loading.jl:539
 [9] include(::String) at ./sysimg.jl:14
 [10] anonymous at ./<missing>:2
 [11] eval(::Module, ::Any) at ./boot.jl:235
 [12] process_options(::Base.JLOptions) at ./client.jl:286
 [13] _start() at ./client.jl:371
while loading /home/carlo/.julia/v0.6/AutoGrad/src/core.jl, in expression starting on line 329
WARNING: Array{T}(::Type{T}, m::Int) is deprecated, use Array{T}(m) instead.
Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:64
 [2] Array(::Type{AutoGrad.Node}, ::Int64) at ./deprecated.jl:51
 [3] AutoGrad.Node(::AutoGrad.Rec{Void}) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:289
 [4] include_from_node1(::String) at ./loading.jl:539
 [5] include(::String) at ./sysimg.jl:14
 [6] include_from_node1(::String) at ./loading.jl:539
 [7] include(::String) at ./sysimg.jl:14
 [8] anonymous at ./<missing>:2
 [9] eval(::Module, ::Any) at ./boot.jl:235
 [10] process_options(::Base.JLOptions) at ./client.jl:286
 [11] _start() at ./client.jl:371
while loading /home/carlo/.julia/v0.6/AutoGrad/src/core.jl, in expression starting on line 329

WARNING: deprecated syntax "typealias D1 Type{Grad{1}}" at /home/carlo/.julia/v0.6/AutoGrad/src/util.jl:325.
Use "const D1 = Type{Grad{1}}" instead.

WARNING: deprecated syntax "typealias D2 Type{Grad{2}}" at /home/carlo/.julia/v0.6/AutoGrad/src/util.jl:326.
Use "const D2 = Type{Grad{2}}" instead.

WARNING: deprecated syntax "typealias Dn{N} Type{Grad{N}}" at /home/carlo/.julia/v0.6/AutoGrad/src/util.jl:328.
Use "Dn{N} = Type{Grad{N}}" instead.
ERROR: LoadError: LoadError: UndefVarError: decolon not defined
Stacktrace:
 [1] include_from_node1(::String) at ./loading.jl:539
 [2] include(::String) at ./sysimg.jl:14
 [3] include_from_node1(::String) at ./loading.jl:539
 [4] include(::String) at ./sysimg.jl:14
 [5] anonymous at ./<missing>:2
while loading /home/carlo/.julia/v0.6/AutoGrad/src/interfaces.jl, in expression starting on line 99
while loading /home/carlo/.julia/v0.6/AutoGrad/src/AutoGrad.jl, in expression starting on line 18
ERROR: Failed to precompile AutoGrad to /home/carlo/.julia/lib/v0.6/AutoGrad.ji.
Stacktrace:
 [1] compilecache(::String) at ./loading.jl:673
 [2] require(::Symbol) at ./loading.jl:431

view is not supported

on julia 0.7:

julia> using AutoGrad
julia> view(rand(2,2,3),2,2,1:2)
2-element view(::Array{Float64,3}, 2, 2, 1:2) with eltype Float64:
 0.26936877599285936
 0.5965847464427165 

julia> view(Rec(rand(2,2,3)),2,2,1:2)
ERROR: MethodError: no method matching view(::Rec{Array{Float64,3}}, ::Int64, ::Int64, ::UnitRange{Int64})
Closest candidates are:
  view(::AbstractArray, ::Any...) where N at subarray.jl:132
Stacktrace:
 [1] top-level scope at none:0

Once view is supported, it will also be easy to support selectdim.
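
A hedged workaround sketch until view is supported: getindex on a Rec is a recorded primitive, so a (copying) slice should be able to stand in for the view:

Rec(rand(2,2,3))[2,2,1:2]   # should return a Rec holding the copied slice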

High memory usage?

It's great to see Julia's collection of AD tools grow!

I tried to take the gradient of the function below, but it seemed to hang for several minutes before Julia crashed:

julia> rosenbrock(x) = sum(map((i, j) -> (1 - j)^2 + 100*(i - j^2)^2, x[2:end], x[1:end-1]))
rosenbrock (generic function with 1 method)

julia> g = AutoGrad.grad(rosenbrock)
(::gradfun) (generic function with 1 method)

julia> x = rand(100000);

julia> @time g(x) # just sits here for several minutes, then crashes

It seems to be because of the sheer amount of memory being used. I know that reverse-mode, in general, can be a memory hog, but Python's autograd library produces this gradient in about 4 seconds on my machine, so I'm suspecting there's a bug somewhere.
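
A hedged sketch, not a fix for the underlying issue: a vectorized formulation records a handful of array operations instead of ~100000 scalar ones, which keeps the tape (and memory) small:

rosenbrock_vec(x) = sum((1 .- x[1:end-1]).^2 .+ 100 .* (x[2:end] .- x[1:end-1].^2).^2)
g = AutoGrad.grad(rosenbrock_vec)
g(rand(100000))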

Issue with special functions

on julia 0.6 and autograd master

julia> using SpecialFunctions

julia> erf(1)
0.8427007929497149

julia> using AutoGrad

julia> grad(erf)(1)
ERROR: erf(1R,) has been moved to the package SpecialFunctions.jl.
Run Pkg.add("SpecialFunctions") to install SpecialFunctions on Julia v0.6 and later,
and then run `using SpecialFunctions`.
Stacktrace:
 [1] erf(::AutoGrad.Rec{Int64}) at ./deprecated.jl:1294
 [2] forward_pass(::Function, ::Tuple{Int64}, ::Array{Any,1}, ::Int64) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:88
 [3] (::AutoGrad.##gradfun#1#3{Base.#erf,Int64})(::Array{Any,1}, ::Function, ::Int64, ::Vararg{Int64,N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:39
 [4] (::AutoGrad.#gradfun#2)(::Int64, ::Vararg{Int64,N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:39
 [5] macro expansion at ./REPL.jl:97 [inlined]
 [6] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73

ambiguity error in x^n

when both x and n are integers:

julia> grad(x->x^3)(1)
ERROR: MethodError: ^(::AutoGrad.Rec{Int64}, ::Int64) is ambiguous. Candidates:
  ^(x, p::Integer) in Base at intfuncs.jl:199
  ^(x1::AutoGrad.Rec{##1045}, x2::##1046) where {##1045<:Number, ##1046<:Number} in AutoGrad
Possible fix, define
  ^(::AutoGrad.Rec{##1045<:Number}, ::Integer)
Stacktrace:
 [1] (::##13#14)(::AutoGrad.Rec{Int64}) at ./REPL[9]:1
 [2] forward_pass(::Function, ::Tuple{Int64}, ::Array{Any,1}, ::Int64) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:88
 [3] (::AutoGrad.##gradfun#1#3{##13#14,Int64})(::Array{Any,1}, ::Function, ::Int64, ::Vararg{Int64,N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:39
 [4] (::AutoGrad.#gradfun#2)(::Int64, ::Vararg{Int64,N} where N) at /home/carlo/.julia/v0.6/AutoGrad/src/core.jl:39
 [5] macro expansion at /home/carlo/.julia/v0.6/Revise/src/Revise.jl:775 [inlined]
 [6] (::Revise.##17#18{Base.REPL.REPLBackend})() at ./event.jl:73
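
A hedged workaround sketch until the ambiguity is resolved: avoid the Integer-exponent method, e.g. with a floating-point exponent or explicit multiplication:

grad(x->x^3.0)(1)   # should give 3.0
grad(x->x*x*x)(1)   # should give 3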

grad doesn't work properly for ArrayFire (OpenCL background)

Code to reproduce with Julia version 1.0.1 (2018-09-29):

julia> using ArrayFire, AutoGrad
ArrayFire v3.5.1 (OpenCL, 64-bit Linux, build 0a675e8)
[0] BEIGNET: Intel(R) HD Graphics Skylake ULT GT2, 4096 MB

julia> afx=AFArray([1f0,2f0,3f0])
AFArray: 3-element Array{Float32,1}:
 1.0
 2.0
 3.0
julia> x=Param(afx)
P(AFArray{Float32,1}(3))

julia> y=@diff sum(abs2(x))
T(14.0)

julia> grad(y,x)
AFArray: 3-element Array{Float32,1}:
 2.0
 0.0
 0.0

julia> y=@diff sum(x)
T(6.0)

julia> grad(y,x)
AFArray: 3-element Array{Float32,1}:
 1.0
 0.0
 0.0

scalar-valued function error

tan(x) = begin
    y = exp(-2x)
    return (1.0 - y) ./ (1.0 + y)
end

Out: tan (generic function with 1 method)

using Knet
dtan = grad(tan)

Out: (::gradfun) (generic function with 1 method)

println(dtan(1))

Out: 0.419974341614026

println(dtan(Any[1,2]))

Out: grad requires a scalar-valued function, got [0.761594, 0.964028]
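
A hedged sketch: grad needs a scalar-valued function, and since each output element here depends only on the matching input element, summing the outputs recovers the elementwise derivatives:

dtan_each = grad(x -> sum(tan(x)))
println(dtan_each(Any[1, 2]))   # expected ≈ [0.419974, 0.0706508]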

Gradient type mismatch

The grad function returns an Array{Float64} gradient for an Array{Float32} input parameter when the differentiated function's output is Float64. Is this expected behaviour of grad? I think the gradient should have the same type as the input parameters, since we use those gradients for SGD updates.

The issue occurs when I upgraded AutoGrad to the latest commit: 6acb8d5

Problem:

julia> f(x) = sum(x.*0.2)
f (generic function with 1 method)

julia> gf = grad(f)
(::gradfun) (generic function with 1 method)

julia> gf(ones(Float32,5,5))
5×5 Array{Float64,2}:
 0.2  0.2  0.2  0.2  0.2
 0.2  0.2  0.2  0.2  0.2
 0.2  0.2  0.2  0.2  0.2
 0.2  0.2  0.2  0.2  0.2
 0.2  0.2  0.2  0.2  0.2

If we remove the 0.2 inside sum, there is no problem:

julia> f(x) = sum(x)
f (generic function with 1 method)

julia> gf = grad(f)
(::gradfun) (generic function with 1 method)

julia> gf(ones(Float32,5,5))
5×5 Array{Float32,2}:
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
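
A hedged workaround sketch: keeping the literal in Float32 keeps the forward value, and hence the gradient, in Float32:

f(x) = sum(x .* 0.2f0)
gf = grad(f)
gf(ones(Float32,5,5))   # expected eltype Float32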

AutoGrad slower in Julia5 compared to Julia4

The following is an example from the Knet test scripts. Julia 0.4 is >2x faster, both on the first run and on the second run of testall(). Does anybody with more insight into Julia have an idea why?

using AutoGrad

function testall()
    for f in (sum, prod, maximum, minimum, sumabs, sumabs2)
        g = grad(f)
        for t in (Float32, Float64)
            for n in (1,(1,1),2,(2,1),(1,2),(2,2))
                d = g(rand(t,n))
            end
        end
    end
end

@time testall()
@time testall()

incorrect gradient when indexing into a matrix of vectors

I'm using the latest master of AutoGrad.jl in Julia 0.6.3 on Windows, and it gives an unexpected gradient when indexing into a matrix of vectors:

julia> a = Matrix{Array{Float32}}(2, 1)
2×1 Array{Array{Float32,N} where N,2}:
 #undef
 #undef

julia> a[1] = [2,3,4]
3-element Array{Int64,1}:
 2
 3
 4

julia> a[2] = [4,5,6]
3-element Array{Int64,1}:
 4
 5
 6

julia> g = grad(x->x[1]' * x[2])
(::gradfun) (generic function with 1 method)

julia> g(a)
2×1 Array{Any,2}:
 Float32[4.0, 5.0, 6.0]
 Float32[2.0, 3.0, 4.0]

julia> g = grad(x->x[1, 1]' * x[2, 1])
(::gradfun) (generic function with 1 method)

julia> g(a)
2×1 Array{Any,2}:
 4.0
 2.0

The two functions are identical in my opinion, but the latter one gives only the gradient of the first element in each vector.
