ad's People

Contributors

adamse, alang9, augustss, bgamari, borgvall, cartazio, christopherdavidwhite, cscherrer, ekmett, expipiplus1, glguy, julmb, konn, marisakirisame, monoidal, msakai, niuren, nushio3, phadej, phischu, ryanglscott, sofusmortensen, sydow, takano-akio, tommd, yairchu

ad's Issues

du failing on simple function

The directional derivative operator du doesn't seem to handle the simplest case.

$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.

Prelude> :m + Numeric.AD

Prelude Numeric.AD> let f [x0,x1,x2] = [x0*x1^2*x2^3, x0^3*x1^2*x2]

Prelude Numeric.AD> :t f
f :: Num t => [t] -> [t]

Prelude Numeric.AD> :t du f

<interactive>:1:4:
    Couldn't match type `[AD
                            s (Numeric.AD.Internal.Forward.Forward a0)]'
                  with `AD s (Numeric.AD.Internal.Forward.Forward a0)'
    Expected type: [AD s (Numeric.AD.Internal.Forward.Forward a0)]
                   -> AD s (Numeric.AD.Internal.Forward.Forward a0)
      Actual type: [AD s (Numeric.AD.Internal.Forward.Forward a0)]
                   -> [AD s (Numeric.AD.Internal.Forward.Forward a0)]
    In the first argument of `du', namely `f'
    In the expression: du f
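
For what it's worth, du expects a scalar-valued function; for a vector-valued f like the one above, the F-suffixed variant is the likely fit. A minimal sketch, assuming the duF that Numeric.AD re-exports alongside du:

import Numeric.AD

f :: Num a => [a] -> [a]
f [x0, x1, x2] = [x0 * x1 ^ 2 * x2 ^ 3, x0 ^ 3 * x1 ^ 2 * x2]

-- Each input carries its own tangent component; this takes the
-- directional derivative along (1, 0, 0) at the point (1, 2, 3).
main :: IO ()
main = print (duF f [(1, 1), (2, 0), (3, 0)])  -- [108, 36]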

Determining when Jacobians are sparse

To what extent is it possible for automatic differentiation to tell us when the partial derivative of a function with respect to one of its inputs is everywhere zero?

The general approach of AD works by evaluating derivatives only at finitely many explicitly specified points, so it cannot by itself prove that a derivative vanishes everywhere.

But it seems that this analysis, or at least a very good conservative approximation of it, is syntax-directed on the grammar of math expressions that Num and Floating and friends let us construct, so maybe there is hope?
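
As a point of comparison, a point sample with Sparse mode shows what is observable today: the zero column below is measured at one point, not proven everywhere. A minimal sketch, assuming Numeric.AD.Mode.Sparse's jacobian:

import qualified Numeric.AD.Mode.Sparse as S

-- x2 never influences the outputs, so its Jacobian column is zero at
-- this sample point, but a finite set of samples cannot by itself
-- distinguish structural zeros from incidental ones.
example :: [[Double]]
example = S.jacobian (\[x0, x1, _x2] -> [x0 * x1, x0 + x1]) [1, 2, 3]
-- [[2.0,1.0,0.0],[1.0,1.0,0.0]]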

nesting: diff (\x -> diff (*x) 1) 1

Was showing off nesting, simplest example I could think of, oops.

tl;dr: with ad 4.3.2.1 and GHC 8.0.1, diff (\x -> diff (*x) 1) 1 gets a type error.

$ cabal update
Downloading the latest package list from hackage.haskell.org
$ cabal install --user ad
Resolving dependencies...
Downloading base-orphans-0.5.4...
...
Installed free-4.12.4
Downloading ad-4.3.2.1...
Configuring ad-4.3.2.1...
Building ad-4.3.2.1...
Installed ad-4.3.2.1
$ ghci
GHCi, version 8.0.1: http://www.haskell.org/ghc/  :? for help
Prelude> :m + Numeric.AD
Prelude Numeric.AD> diff (*2) 1
2
Prelude Numeric.AD> diff sin 0
1.0
Prelude Numeric.AD> diff (\x -> diff (*x) 1) 1

<interactive>:6:20: error:
    • Occurs check: cannot construct the infinite type:
        a ~ AD s (Numeric.AD.Internal.Forward.Forward a)
      Expected type: AD
                       s1
                       (Numeric.AD.Internal.Forward.Forward
                          (AD s (Numeric.AD.Internal.Forward.Forward a)))
        Actual type: AD s (Numeric.AD.Internal.Forward.Forward a)
    • In the second argument of ‘(*)’, namely ‘x’
      In the first argument of ‘diff’, namely ‘(* x)’
      In the expression: diff (* x) 1
    • Relevant bindings include
        x :: AD s (Numeric.AD.Internal.Forward.Forward a)
          (bound at <interactive>:6:8)
        it :: a (bound at <interactive>:6:1)

This happens even without nesting.

> diff (pi*) 1
3.141592653589793
> (\x -> diff (x*) 1) pi
error: ...
> let x=pi in diff (x*) 1
3.141592653589793

and yes, I know auto makes it okay

Prelude Numeric.AD> diff (\x -> diff (auto x*) 2) 3
1
Prelude Numeric.AD> (\x -> diff (auto x*) 2) 3
3

But it's still a pretty big wart on an otherwise lovely countenance.

Nesting Benchmark

My Haskell type fu is punching some air; anyone care to have a look?

This is for a cross-language benchmark. Each language/AD system has a "common" file shared by all the benchmarks. Each benchmark is rewritten in each language/system with an attempt to preserve logic and names across implementations. (Hence, e.g., the gradient = grad line.)

There is an outer level of AD (gradient descent) and an inner level (Euler's method).

Ideally, both inner and outer could be set to use Forward / Reverse mode independently, to yield four benchmarks (FF, FR, RF, RR). Yes, in this case things are scalar or 2D, meaning reverse mode won't be faster. It's a benchmark.

Would be okay to sprinkle in minor magic for speed, like strictness annotations.

The style used throughout (across languages) is to avoid type declarations except where necessary, i.e., to let automatic type inference do its thing.

File Common_Haskell_AD.hs

{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE FlexibleContexts #-}

module Common_Haskell_AD where

import Numeric.AD
import Numeric.AD.Internal.Forward (Forward)
import Data.Reflection (Reifies)
import Numeric.AD.Internal.Reverse (Tape, Reverse, primal)

lift x = auto x

derivative
  :: Num a =>
     (forall s.
      AD s (Numeric.AD.Internal.Forward.Forward a)
      -> AD s (Numeric.AD.Internal.Forward.Forward a))
     -> a -> a
derivative = diff

sqr x = x * x

vplus :: Num a => [a] -> [a] -> [a]
vplus = zipWith (+)

vminus :: Num a => [a] -> [a] -> [a]
vminus = zipWith (-)

ktimesv k = map (k *)

magnitude_squared x = foldl (+) 0 (map sqr x)

magnitude :: Floating a => [a] -> a
magnitude = sqrt . magnitude_squared

distance_squared u v = magnitude_squared (vminus u v)

distance u v = sqrt (distance_squared u v)

-- replace_ith (x : xs) 0 xi = (xi : xs)
-- replace_ith (x : xs) i xi = (x : (replace_ith xs (i - 1) xi))

replace_ith xs i xi = take i xs ++ [xi] ++ drop (i+1) xs

gradient
  :: Num a =>
     (forall s.
      Data.Reflection.Reifies s Numeric.AD.Internal.Reverse.Tape =>
      [Numeric.AD.Internal.Reverse.Reverse s a]
      -> Numeric.AD.Internal.Reverse.Reverse s a)
     -> [a] -> [a]

gradient f x = grad f x

-- lower_fs :: (Mode b, Num a) => ([b] -> Reverse s a) -> [Scalar b] -> a
-- lower_fs :: Num a => ([Reverse s a] -> Reverse s a) -> [a] -> a

-- lower_fs f xs = primal (f (map auto xs))

lower_fs
  :: Num a => (forall s. Data.Reflection.Reifies s Tape =>
               [Reverse s a] -> Reverse s a) -> [a] -> a

lower_fs f xs = fst (grad' f xs)

multivariate_argmin, multivariate_argmax ::
  (Ord a, Floating a, Fractional a) =>
  (forall s. Data.Reflection.Reifies s Tape =>
    [Reverse s a] -> Reverse s a) -> [a] -> [a]

multivariate_max ::
  (Ord a, Floating a, Fractional a) =>
  (forall s. Data.Reflection.Reifies s Tape =>
    [Reverse s a] -> Reverse s a) -> [a] -> a

multivariate_argmin f x =
    let 
        loop x fx gx eta i =
            if (magnitude gx) <= 1e-5
            then x
            else if i == 10
                 then loop x fx gx (2 * eta) 0
                 else let x_prime = vminus x (ktimesv eta gx)
                      in if (distance x x_prime) <= 1e-5
                         then x
                         else let fx_prime = lower_fs f x_prime
                               in if fx_prime < fx
                                  then loop x_prime fx_prime (gradient f x_prime) eta (i + 1)
                                  else loop x fx gx (eta / 2) 0
    in loop x (lower_fs f x) (gradient f x) 1e-5 0

multivariate_argmax f x = multivariate_argmin (\ x -> - (f x)) x

multivariate_max f x = (lower_fs f) (multivariate_argmax f x)

File particle-XX-haskell-ad.hs

{-# LANGUAGE AllowAmbiguousTypes #-}

import Common_Haskell_AD

naive_euler w =
    let charges = [[10, 10 - lift w], [10, 0]]
        x_initial = [0, 8]
        xdot_initial = [0.75, 0]
        delta_t = 1e-1
        p x = sum (map (recip . distance x) charges)
        loop x @ [_, x_1] xdot @ [_, xdot_1] =
            let xddot = ktimesv (-1) (gradient p x)
                x_new @ [_, x_new_1] = vplus x (ktimesv delta_t xdot)
            in if x_new_1 > 0
               then loop x_new (vplus xdot (ktimesv delta_t xddot))
               else let delta_t_f = - x_1 / xdot_1
                        [x_t_f_0, _] = vplus x (ktimesv delta_t_f xdot)
                    in sqr x_t_f_0
    in loop x_initial xdot_initial

run = let w0 = 0
          [w_star] = multivariate_argmin (\ [w] -> naive_euler w) [w0]
      in w_star

main = print run

zero, NaN, zero?

What's that big stripe of NaNs, and why does it go back to zeros?

*Main Numeric.AD> take 1000 $ diffs (**3) (5::Double)
[125.0,75.0,30.0,6.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0, ...
 ... (zeros continue for roughly 240 more entries) ...
 NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN, ...
 ... (a stripe of roughly 215 consecutive NaNs) ...
 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0, ...
 ... (zeros again through the end of the 1000 entries) ..., 0.0]

Gradient descent fails to converge (to the right result).

Consider the function,

let f [m,b] = (6020.272727*m+b-23680.30303)^(2::Int) + (7254.196429*m+b-28807.61607)^(2::Int) + (6738.575342*m+b-26582.76712)^(2::Int) + (5464.658537*m+b-23894.34756)^(2::Int)

It's a smooth convex function and gradient descent should converge to [m,b] = [2.911525576,7196.512447]. See Wolfram Alpha.

Instead,

(!! 10) $ gradientDescent f [0,0]

[6.503543385615585,1.0522541316691529e-3]

(!! 100) $ gradientDescent f [0,0]

[4.074543700217947,1.0251230667415617e-3]

(!! 1000) $ gradientDescent f [0,0]

[4.028579379271611,4.621971888645496e-3]

(!! 10000) $ gradientDescent f [0,0]

[4.02857388945333,4.100383031651195e-2]

miscalculating grads

grads appears to miscalculate derivatives of degree higher than 2. For example,

> let f = \[x] -> x^3
> let cs = [1]
> headJet $ jet (grads f cs)
1
> headJet $ tailJet $ jet (grads f cs)
[3]
> headJet $ tailJet $ tailJet $ jet (grads f cs)
[[6]]
> headJet $ tailJet $ tailJet $ tailJet $ jet (grads f cs)
[[[2]]]
> headJet $ tailJet $ tailJet $ tailJet $ tailJet $ jet (grads f cs)
[[[[0]]]]

Compare with diffs:

> diffs (\x -> x^3) 1
[1,3,6,6]

Using AD with exceptions

I tried using AD with a function that has a Typeable constraint and ran into Could not deduce (Typeable s) [..] from: Reifies s Tape [..].

This seems very similar to the problem described in ekmett/reflection#14.
Am I correct in assuming that this comes from the usage of reifyTape in the AD internals and could be fixed by using something closer to the reifyTypeable like in the other issue?

'cabal install AD' fails when installing semigroupoids

When running 'cabal install AD' on Ubuntu 14.04.2 with cabal 1.16.0.2:

src/Data/Semigroup/Foldable/Class.hs:103:10:
    No instance for (Bifoldable (Tagged *))
    arising from the superclasses of an instance declaration
    Possible fix:
    add an instance declaration for (Bifoldable (Tagged *))
    In the instance declaration for `Bifoldable1 (Tagged *)'
Failed to install semigroupoids-5.0.0.2
cabal: Error: some packages failed to install:
ad-4.2.2 depends on semigroupoids-5.0.0.2 which failed to install.
free-4.12.1 depends on semigroupoids-5.0.0.2 which failed to install.
semigroupoids-5.0.0.2 failed during the building phase. The exception was:
ExitFailure 1

Inefficiency caused by current Mode scheme

The current entangling of the concepts of Mode and infinitesimal means that every user of the AD library has to dispatch virtually everything they want to do with an AD variable through a dictionary. While this provides unusually strong guarantees (only the properties that AD s a inherits from Mode s and the underlying numeric type a can be used), it means that the resulting code can be considerably slower than would otherwise be possible.

We could adjust AD to have separate mode and infinitesimal attributes, and adjust UU, UF, FU, FF to have MUU, MUF, MFU, MFF variants that accept a mode, so that most of the combinators do the right thing. The cost of this is that we need to be more careful about all the extraneous instances that our Modes have picked up, and about which properties we allow AD to inherit.

To obtain something approximating the current level of safety, we'll have to remove a lot of current instances and hide the AD data constructor.

Add `erf` support and perhaps an example of how to bundle a function with its derivatives.

From an email:

An example of how to "hot wire" a simple scalar numeric function (in this case defined externally) with its derivative for use in AD would be a nice example in the AD package. So making this simple/clear seems like it would be useful documentation.

Anyway, I'm leaving out the magic below.
Might need to use template magic, since that's how things like `atan` work.
That makes it double magical of course.
module Numeric.AD.Erf () where

import Data.Number.Erf
import Numeric.AD
import Numeric.AD.Internal

-- derf :: Erf a => a -> a -- or just Floating
derf x = (2 / sqrt pi)*exp(- x*x)

-- dinverf :: (InvErf a, Erf a) => a -> a -> a  -- or just Floating
-- (lift1_ passes the primal result z = inverf x as the first argument)
dinverf z _x = recip $ derf z

MAGIC GOES HERE
 (Erf a, Mode t) => Erf (t a)
   MORE MAGIC
 erf1 = lift1 erf $ derf

MORE OF THE SAME MAGIC
 (InvErf a, Mode t) => InvErf (t a) -- maybe also add Erf a on the lhs
 inverf1 = lift1_ inverf $ dinverf
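
A concrete but hypothetical rendering of the sketch above, written against the internal Jacobian class that the library itself uses to wire up methods like sin = lift1 sin cos. The real package would presumably generate per-mode instances via Template Haskell, so treat the module, class, and constraint names here as assumptions:

import Data.Number.Erf (Erf (..), InvErf (..))
import Numeric.AD.Jacobian (Jacobian (..))
import Numeric.AD.Mode (Mode (..))

derf' :: Floating a => a -> a
derf' x = (2 / sqrt pi) * exp (negate (x * x))

erfLifted :: (Jacobian t, Erf (Scalar t), Floating (D t)) => t -> t
erfLifted = lift1 erf derf'

-- lift1_ hands the derivative function the primal result first, so the
-- derivative of inverf at x is recip (derf' z) where z = inverf x.
inverfLifted :: (Jacobian t, InvErf (Scalar t), Floating (D t)) => t -> t
inverfLifted = lift1_ inverf (\z _ -> recip (derf' z))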

How can I add custom gradients?

If I want to give a custom or known gradient for a function, how can I do that in this library? (I don't want to autodifferentiate through this function.) I am using the grad function.

If the library doesn't provide this feature, is there some way I can easily implement this functionality myself, perhaps by changing the definitions of leaf nodes or by editing the dual numbers that presumably carry the numerical gradients?

Here's a concrete example of what I mean:

Say I have some function I want to take the gradient of, say f(x, y) = x^2 + 3 * g(x, y)^2. Then say that g(x, y) is a function whose definition is complicated and involves lots of Haskell code, but whose gradient I've already calculated analytically and is quite simple. Thus, when I take grad f and evaluate it at a point (x, y), I'd like to just plug in my custom gradient for g, instead of autodiffing through it: something like my_nice_grad_of_g (x, y).

I see other autodiff libraries do provide this feature, for example Stan and Tensorflow both allow users to define gradients of a function.

Thanks!
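
There is no public API for this as far as I know, but here is a hedged sketch of one possible approach: the internal Jacobian class exposes lift1, which pairs a unary primitive with a hand-written derivative so that grad does not trace through its body (the same mechanism the library uses for its own Floating methods; lift2 plays the analogous role for binary functions). Everything below is illustrative, not a supported interface:

import Numeric.AD (grad)
import Numeric.AD.Jacobian (Jacobian (..))
import Numeric.AD.Mode (Mode (..))

g, dg :: Floating a => a -> a
g x  = sin x  -- stand-in for the complicated primal computation
dg x = cos x  -- its analytically known derivative

gCustom :: (Jacobian t, Floating (Scalar t), Floating (D t)) => t -> t
gCustom = lift1 g dg

example :: [Double]
example = grad (\[x, y] -> x ^ 2 + 3 * gCustom (x * y) ^ 2) [1, 2]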

Unexpected difference in behavior of `grad` for different modes?

Testing the gradient of the simple function x**y at x = 0 with y /= 0, I get different answers depending on the mode:

> import qualified Numeric.AD.Mode.Sparse as S
> S.grad (\[x,y] -> x**y) [0,2]
[NaN, NaN]
> import qualified Numeric.AD.Mode.Reverse as R
> R.grad (\[x,y] -> x**y) [0,2]
[NaN, NaN]
> import qualified Numeric.AD.Mode.Kahn as K
> K.grad (\[x,y] -> x**y) [0,2]
[NaN, 0.0]

Only the forward mode yields the mathematically "correct" answer:

> import qualified Numeric.AD.Mode.Forward as F
> F.grad (\[x,y] -> x**y) [0,2]
[0.0, NaN]

Which of the three answers is the expected behavior? And, are all three supposed to give different answers for something like this?

Matrix algebra support?

Does this package support matrix algebra operations? For instance, does this support HMatrix, Repa or Accelerate, or some other way of getting gradients for functions involving matrices? I'm looking for a Haskell alternative to the Python autograd package.

Forward : Wengert :: Speelpenning : Reverse

Even though "Wengert list" is common terminology for a topological sort of the flow graph, I think Wengert Mode has the wrong association here. Robert Edwin Wengert invented (or discovered, or rediscovered if you read Russian) forward mode. Bert Speelpenning invented (or discovered or rediscovered if you read Werbos or Bryson & Ho or Pontryagin) reverse mode. And I suppose Babbage or Lovelace probably invented serialization of the dynamic trace of an instruction stream... Point is, naming a Reverse Mode subtype "Wengert" mode is really confusing. I would suggest PontryaginDiscretized since the list at issue is precisely the discrete version of his continuous differential equation.

Jacobian fails when function is a parameter

Hi!

I'm trying to automate the creation of a Jacobian for any function f with the following type:

f :: Floating a => [a] -> [a] -> [a]

If I define f beforehand, then everything works ok:

jac = jacobian (\params -> expModel params [1,1])

and I can then apply jac to any params.

But if I have:

jac f = jacobian (\params -> f params [1,1])

then it breaks:

<interactive>:42:33: error:
    • Couldn't match expected type ‘f (Numeric.AD.Internal.Reverse.Reverse
                                         s a)
                                    -> [Integer] -> g (Numeric.AD.Internal.Reverse.Reverse s a)’
                  with actual type ‘p’
        because type variable ‘s’ would escape its scope
      This (rigid, skolem) type variable is bound by
        a type expected by the context:
          forall s.
          reflection-2.1.4:Data.Reflection.Reifies
            s Numeric.AD.Internal.Reverse.Tape =>
          f (Numeric.AD.Internal.Reverse.Reverse s a)
          -> g (Numeric.AD.Internal.Reverse.Reverse s a)
        at <interactive>:42:12-47
    • In the expression: f params [1, 1]
      In the first argument of ‘jacobian’, namely
        ‘(\ params -> f params [1, 1])’
      In the expression: jacobian (\ params -> f params [1, 1])
    • Relevant bindings include
        params :: f (Numeric.AD.Internal.Reverse.Reverse s a)
          (bound at <interactive>:42:23)
        f :: p (bound at <interactive>:42:8)
        autjac :: p -> f a -> g (f a) (bound at <interactive>:42:1)

Same thing if I try to parametrize x:

> x = jacobian (\params -> expModel params x) 


<interactive>:55:45: error:
    • Couldn't match expected type ‘[Numeric.AD.Internal.Reverse.Reverse
                                       s a]’
                  with actual type ‘p’
        because type variable ‘s’ would escape its scope
      This (rigid, skolem) type variable is bound by
        a type expected by the context:
          forall s.
          reflection-2.1.4:Data.Reflection.Reifies
            s Numeric.AD.Internal.Reverse.Tape =>
          [Numeric.AD.Internal.Reverse.Reverse s a]
          -> [Numeric.AD.Internal.Reverse.Reverse s a]
        at <interactive>:55:8-46
    • In the second argument of ‘expModel’, namely ‘x’
      In the expression: expModel params x
      In the first argument of ‘jacobian’, namely
        ‘(\ params -> expModel params x)’
    • Relevant bindings include
        params :: [Numeric.AD.Internal.Reverse.Reverse s a]
          (bound at <interactive>:55:19)
        x :: p (bound at <interactive>:55:4)
        g2 :: p -> [a] -> [[a]] (bound at <interactive>:55:1)

What am I doing wrong?
Thanks for the help!
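
The usual diagnosis: GHC never infers higher-rank types, so the function parameter needs an explicit signature that quantifies over the tape parameter s. A sketch of the fix, assuming expModel from the question:

{-# LANGUAGE RankNTypes #-}

import Data.Reflection (Reifies)
import Numeric.AD (jacobian)
import Numeric.AD.Internal.Reverse (Reverse, Tape)

jac :: Num a
    => (forall s. Reifies s Tape => [Reverse s a] -> [Reverse s a] -> [Reverse s a])
    -> [a] -> [[a]]
jac f = jacobian (\params -> f params [1, 1])

With that signature in place, jac expModel should typecheck for any suitably polymorphic expModel.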

Multi-variate root finder?

Are there any plans for a multi-variate Newton-Raphson or Halley root finder? If not, I might contribute one.
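
In the meantime, a minimal sketch of what a multivariate Newton-Raphson step could look like on top of jacobian'. The linear solver is taken as a parameter (e.g. from hmatrix), since ad itself ships no linear algebra; all names here are illustrative, not a proposed API:

{-# LANGUAGE RankNTypes #-}

import Data.Reflection (Reifies)
import Numeric.AD (jacobian')
import Numeric.AD.Internal.Reverse (Reverse, Tape)

-- Newton-Raphson: repeatedly solve J dx = f(x) and step to x - dx.
findZeroM
  :: ([[Double]] -> [Double] -> [Double])  -- linear solver for J dx = fx
  -> (forall s. Reifies s Tape => [Reverse s Double] -> [Reverse s Double])
  -> [Double]    -- starting point
  -> [[Double]]  -- stream of iterates
findZeroM solve f = iterate step
  where
    step x =
      let fj = jacobian' f x  -- pairs of (f_i x, row i of the Jacobian)
      in zipWith (-) x (solve (map snd fj) (map fst fj))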

Two associated pattern synonyms (KnownConstant / KnownZero) for Mode

Worth noting, it seems like we can replace isKnownConstant / auto and isKnownZero / zero with pattern synonyms once we get associated pattern synonyms (and we can also default type Scalar t = t)

class (Num t, Num (Scalar t)) => Mode t where
  type Scalar t
  type Scalar t = t

  pattern KnownConstant :: Scalar t -> t
  pattern KnownZero     :: t
  pattern KnownZero     = KnownConstant 0

  (*^) :: Scalar t -> t -> t
  a *^ b = KnownConstant a * b

  (^*) :: t -> Scalar t -> t
  a ^* b = a * KnownConstant b

  (^/) :: Fractional (Scalar t) => t -> Scalar t -> t
  a ^/ b = a ^* recip b

And the old

auto :: Mode s => Scalar s -> s
auto = KnownConstant

zero :: Mode s => s
zero = auto 0

isKnownConstant :: Mode s => s -> Bool
isKnownConstant KnownConstant{} = True
isKnownConstant _               = False

isKnownZero :: Mode s => s -> Bool
isKnownZero KnownZero = True
isKnownZero _         = False

while also giving us

getKnownConstant :: Mode s => s -> Maybe (Scalar s)
getKnownConstant (KnownConstant n) = Just n
getKnownConstant _                 = Nothing

These could be defined in terms of Prisms but that rules out deriving.

4.2 compilation failure

Preprocessing library ad-4.2...

src/Numeric/AD/Halley.hs:32:18:
    Could not find module `Numeric.AD.Rank1.Halley'
    Use -v to see a list of the files searched for.
Failed to install ad-4.2
cabal: Error: some packages failed to install:
ad-4.2 failed during the building phase. The exception was:
ExitFailure 1

Vectored AD

It would be nice to invert our current control mechanism so that instead of having

grad :: (Traversable f, Num a) => (forall s. Mode s => f (AD s a) -> AD s a) -> f a -> f a

we could have (also factoring in issue 1):

type RAD = AD Reverse

grad :: (Traversable f, Num a) => (forall s. RAD f s a -> RAD Scalar s a) -> f a -> f a

This can then be used to allow us to unify instance heads for the various UU, UF, FU, FF arguments, to allow vgrad to take multiple 'containers' worth of AD variables, and to act as a path to supporting Taylor-series-of-matrices rather than matrices-of-Taylor-series, for a better ability to leverage traditional BLAS packages, etc.

doctest errors with GHC-8.1.20170123

### Failure in /home/ogre/Documents/kmettverse/ad/src/Numeric/AD/Rank1/Kahn.hs:208: expression `hessianF (\[x,y] -> [x*y,x+y,exp x*cos y]) [1,2]'
expected: [[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]],[[-1.1312043837568135,-2.4717266720048188],[-2.4717266720048188,1.1312043837568135]]]
 but got: [[[*** Exception: Data.Array.Base.bOOL_WORD_SCALE: Overflow
          CallStack (from HasCallStack):
            error, called at libraries/array/Data/Array/Base.hs:1363:17 in array-0.5.1.2:Data.Array.Base

In total: Examples: 101 Tried: 97 Errors: 0 Failures: 20

AD with Accelerate types

It would be nice to be able to use AD with Accelerate types (like Exp Double or Exp (Complex Double))

I think the big issues are:

  1. A lot of typeclass methods on accelerate types result in bottoms. e.g.
instance P.Eq (Exp a) where
  (==) = preludeError "Eq.==" "(==)"
  (/=) = preludeError "Eq./=" "(/=)"
instance Ord a => P.Ord (Exp a) where
  compare = error "Prelude.Ord.compare applied to EDSL types"
  (<)     = preludeError "Ord.<"  "(<)"
  (<=)    = preludeError "Ord.<=" "(<=)"
  (>)     = preludeError "Ord.>"  "(>)"
  (>=)    = preludeError "Ord.>=" "(>=)"
  min     = min
  max     = max
instance P.Enum (Exp a) where
  toEnum   = preludeError "toEnum"
  fromEnum = preludeError "fromEnum"
instance (Num a, Ord a) => P.Real (Exp a) where
  toRational = P.error "Prelude.toRational not supported for Accelerate types"
  2. Accelerate appears to be very slow at compiling code with recursive functions. It seems better to iterate with Data.Array.Accelerate.while or Data.Array.Accelerate.iterate.

weird stuff with partial function application to lists?

Hi, I'm new to Haskell, but I've asked around on some of the freenode communities and nobody there can help me. This may not be a bug, but simply my inexperience.

f :: Num a => [a] -> a
f = \[u, v] -> v - (u * u * u)

grad f [1, 1] works just fine in GHCi

map (\x -> grad x) [f] gives the following error:
Couldn't match type ‘Integer’
              with ‘Numeric.AD.Internal.Reverse.Reverse s a’
Expected type: [Numeric.AD.Internal.Reverse.Reverse s a]
               -> Numeric.AD.Internal.Reverse.Reverse s a
  Actual type: [Integer] -> Integer
• In the first argument of ‘grad’, namely ‘f’
  In the expression: grad f
  In the first argument of ‘map’, namely ‘(\ f -> grad f)’
• Relevant bindings include
    it :: [[a] -> [a]] (bound at <interactive>:6:1)

I've tried many different ways of partially applying grad to a list of functions but cannot get it to typecheck.
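
For the record, the usual diagnosis and workaround (hedged, since the exact defaulting GHCi picks may vary): putting f into a list forces it down to a single monomorphic type, here defaulted to [Integer] -> Integer, which no longer matches grad's higher-rank argument. Wrapping the quantified type in a newtype keeps the list elements polymorphic:

{-# LANGUAGE RankNTypes #-}

import Data.Reflection (Reifies)
import Numeric.AD (grad)
import Numeric.AD.Internal.Reverse (Reverse, Tape)

-- Wrap the quantified function type so it can live inside a list.
newtype Fn = Fn (forall s. Reifies s Tape => [Reverse s Double] -> Reverse s Double)

f :: Num a => [a] -> a
f [u, v] = v - (u * u * u)

gradients :: [[Double] -> [Double]]
gradients = map (\(Fn g) -> grad g) [Fn f]

main :: IO ()
main = print (map ($ [1, 1]) gradients)  -- [[-3.0,1.0]]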

efficiency of tower mode

$ ghci
> :m + Numeric.FAD Numeric.AD.Mode.Tower
> Numeric.FAD.diffs exp 1

[2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045...scrolls endlessly at lightning speed until interrupted...]

> Numeric.AD.Mode.Tower.diffs exp 1

[2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045, 2.718281828459045...slower and slower for three lines or so, then freezes, machine starts to thrash, takes a few min to respond to C-c...]

Generate Specialized Code for known objective functions (recording prior dev discussion on this topic)

One long standing idea that's been discussed is generating more specialized / optimized code when the expression to be evaluated is known.

An example could be generating specialized gradient code for a known objective function, etc.

There are three main strategies for doing so:

a) Template Haskell, which would likely be a bit tricky to write and make easy to use.
b) A GHC compiler plugin, which, if written well, would work for normal AD code as is, or at least would do so some fraction of the time.
c) Alternatively, some sort of run-time code generation scheme via LLVM might be possible.

derivative-taking or derivative-using functions are not first class

Part of the goal of automatic differentiation is to make taking derivatives easy. In a functional language this means that functions for taking derivatives, or functions that use them, should be first class.

In another issue, I wrote something like this:

*Numeric.AD Data.Function> take 20 $ zipWith (-) (diffs (**(1/2)) 4) (diffs (**0.5) 4)
[0.0,-0.4571067811865476,NaN,5.7079671162651446e-2,NaN,6.408282648826724e-3,NaN,9.91344451904307e-3,NaN,3.0205026268959045e-2,NaN,0.15244099195115268,NaN,1.1504531111313554,NaN,12.133685156463514,NaN,170.44035868219845,NaN,3075.9158480928004]
*Numeric.AD Data.Function> take 20 $ zipWith (-) (diffs sqrt 4) (diffs (**(1/2)) 4)
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-9.094947017729282e-13]

As a functional programmer, the redundancy bothers me. I would really want to write this instead:

*Numeric.AD Data.Function> take 20 $ on (zipWith (-)) (flip diffs 4) sqrt (**(1/2))

<interactive>:46:43:
Couldn't match expected type `forall s.
                  Numeric.AD.Internal.Tower.Tower a0 s
                  -> Numeric.AD.Internal.Tower.Tower a0 s'
        with actual type `a1 -> a1'
In the third argument of `on', namely `sqrt'
In the second argument of `($)', namely
  `on (zipWith (-)) (flip diffs 4) sqrt (** (1 / 2))'
In the expression:
  take 20 $ on (zipWith (-)) (flip diffs 4) sqrt (** (1 / 2))

<interactive>:46:49:
Couldn't match expected type `forall s.
                  Numeric.AD.Internal.Tower.Tower a0 s
                  -> Numeric.AD.Internal.Tower.Tower a0 s'
        with actual type `a2 -> a2'
In the fourth argument of `on', namely `(** (1 / 2))'
In the second argument of `($)', namely
  `on (zipWith (-)) (flip diffs 4) sqrt (** (1 / 2))'
In the expression:
  take 20 $ on (zipWith (-)) (flip diffs 4) sqrt (** (1 / 2))

Inlining the on does not solve the problem:

*Numeric.AD Data.Function> take 20 $ zipWith (-) (flip diffs 4 sqrt) (flip diffs 4 (**(1/2)))

<interactive>:47:37:
Couldn't match expected type `forall s.
                  Numeric.AD.Internal.Tower.Tower a0 s
                  -> Numeric.AD.Internal.Tower.Tower a0 s'
        with actual type `a1 -> a1'
In the third argument of `flip', namely `sqrt'
In the second argument of `zipWith', namely `(flip diffs 4 sqrt)'
In the second argument of `($)', namely
  `zipWith (-) (flip diffs 4 sqrt) (flip diffs 4 (** (1 / 2)))'

<interactive>:47:58:
Couldn't match expected type `forall s.
                  Numeric.AD.Internal.Tower.Tower a0 s
                  -> Numeric.AD.Internal.Tower.Tower a0 s'
        with actual type `a2 -> a2'
In the third argument of `flip', namely `(** (1 / 2))'
In the third argument of `zipWith', namely
  `(flip diffs 4 (** (1 / 2)))'
In the second argument of `($)', namely
  `zipWith (-) (flip diffs 4 sqrt) (flip diffs 4 (** (1 / 2)))'

Further inlining the flip brings us back to the original expression. The algebraic rules which we would like to rely on (here, the ability to abstract) do not hold. In this sense, diffs and any other function which takes derivatives (even internally) are not "first-class functions", in that they do not obey the algebra we expect functions to.

I understand that this is due to the types involved. It seems unlikely that this can be fully addressed in Haskell, at least without augmentation of the type system. But I wanted to record the issue where users can find it, and an issue here seems like the best place.
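
For completeness, the Rank1 modules sidestep this particular pain point, at the cost of the safety that the quantified s parameter provides. A sketch, assuming Numeric.AD.Rank1.Tower:

import Data.Function (on)
import qualified Numeric.AD.Rank1.Tower as T

-- With the rank-1 Tower mode there is no forall in diffs' argument
-- type, so on, flip, and friends compose as expected.
example :: [Double]
example = take 20 $ on (zipWith (-)) (flip T.diffs 4) sqrt (** (1 / 2))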

Feasibility finder often loops on Newton constrainedDescent

{-# LANGUAGE BlockArguments #-}

import Data.Functor.Identity
import Numeric.AD (auto)
import Numeric.AD.Newton

ex :: [ (Double , Identity Double) ]
ex = constrainedDescent (const (auto 1)) cs (Identity 18.5)
  where
    cs = [ CC \(Identity x) -> auto 0.0 - x
         , CC \(Identity x) -> x - auto 10.0
         ]

The above is simply trying to find a feasible value between 0 and 10, with a starting guess of 18.5.

It works for starting values below 18.4, but as soon as I try anything 18.5 or above, it just loops indefinitely.

I tried digging through the code some, and if I change that nrSteps = [take 20 | _ <- [1..length tValues]] ++ [id] line in constrainedConvex' to take 60 steps instead of 20, I have some more success, but I can still get failures just by increasing my starting guess a bit further.

I'm not entirely familiar with what's going on, but it looks like the search attempts often converge on infeasible solutions. I think it's because the earlier attempts try smaller steps but bail out if a solution is not found quickly, while later attempts take such large steps that they converge on something outside the bounds entirely. Perhaps it would make sense to re-try smaller steps, starting from that point, when the big steps converge?

Double specific implementation of Newton?

When using Newton.findZero I'm getting results in about 100ns. If I take the source code and change out diff' for Forward.Double.diff' I get results in 6ns - a substantial difference. Is there any thought about either having a .Double module of this, or maybe a parameterized version that takes diff' as an argument?
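
A minimal sketch of the parameterized idea; the names are illustrative, and ForwardDouble currently lives in an Internal module:

import Numeric.AD.Internal.Forward.Double (ForwardDouble)
import qualified Numeric.AD.Mode.Forward.Double as FD

-- Newton iteration over any supplier of (value, derivative) pairs.
findZeroBy :: Fractional a => (a -> (a, a)) -> a -> [a]
findZeroBy fd = iterate (\x -> let (y, y') = fd x in x - y / y')

-- Instantiation with the Double-specialized forward mode.
fastZeros :: (ForwardDouble -> ForwardDouble) -> Double -> [Double]
fastZeros f = findZeroBy (FD.diff' f)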

(**) higher-order derivatives

This is against the current master branch tip.

> take 10 $ diffs (\x->x^1) 1
[1,1]
> take 10 $ diffs (\x->x**1) 1
[1.0,1.0,Infinity,NaN,NaN,NaN,NaN,NaN,NaN,NaN]

Glork.

:t diff . diff

Consider

diff2 = diff . diff

Fill in the type:

diff2 :: ???

This would be right, but we'd need to use a fiddled-type variant of diff in the definition

diff2 :: Num a =>
      (forall (s1 :: * -> *).
       (forall (s2 :: * -> *).
        (Mode s1, Mode s2) => AD s1 (AD s2 a) -> AD s1 (AD s2 a)))
      -> a -> a

The "issue" is that these types drive you crazy making it impractical to use this library when defining even seemingly-simple differential operators.

Test suite floating point accuracy failure when building with non-glibc-libc

Hey Ed,

for the advancement of static linking in Haskell (https://github.com/nh2/static-haskell-nix and NixOS/nixpkgs#43795) I'm building all Stackage executables statically, which means building against musl libc instead of glibc.

I noticed the following test failures in ad:

/tmp/nix-build-ad-4.3.5.drv-0/ad-4.3.5/src/Numeric/AD.hs:201: failure in expression `hessianF (\[x,y] -> [x*y,x+y,exp x*cos y]) [1,2]'
expected: [[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]],[[-1.1312043837568135,-2.4717266720048188],[-2.4717266720048188,1.1312043837568135]]]
 but got: [[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]],[[-1.1312043837568138,-2.471726672004819],[-2.471726672004819,1.1312043837568138]]]

/tmp/nix-build-ad-4.3.5.drv-0/ad-4.3.5/src/Numeric/AD/Mode/Kahn.hs:194: failure in expression `hessianF (\[x,y] -> [x*y,x+y,exp x*cos y]) [1,2]'
expected: [[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]],[[-1.1312043837568135,-2.4717266720048188],[-2.4717266720048188,1.1312043837568135]]]
 but got: [[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]],[[-1.1312043837568138,-2.471726672004819],[-2.471726672004819,1.1312043837568138]]]

/tmp/nix-build-ad-4.3.5.drv-0/ad-4.3.5/src/Numeric/AD/Rank1/Kahn.hs:208: failure in expression `hessianF (\[x,y] -> [x*y,x+y,exp x*cos y]) [1,2]'
expected: [[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]],[[-1.1312043837568135,-2.4717266720048188],[-2.4717266720048188,1.1312043837568135]]]
 but got: [[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]],[[-1.1312043837568138,-2.471726672004819],[-2.471726672004819,1.1312043837568138]]]

/tmp/nix-build-ad-4.3.5.drv-0/ad-4.3.5/src/Numeric/AD/Mode/Reverse.hs:205: failure in expression `hessianF (\[x,y] -> [x*y,x+y,exp x*cos y]) [1,2]'
expected: [[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]],[[-1.1312043837568135,-2.4717266720048188],[-2.4717266720048188,1.1312043837568135]]]
 but got: [[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]],[[-1.1312043837568138,-2.471726672004819],[-2.471726672004819,1.1312043837568138]]]

Examples: 105  Tried: 105  Errors: 0  Failures: 4

I suspect that these numbers rely on implementation details in libm as provided by the libc, for example how sin and cos are implemented (see, for example, a recent change in glibc).

Possibly also relevant is fused multiply-add (FMA), which means the results can differ depending on the processor.

Would it be possible to implement these checks against some epsilon instead of exact values so that they will pass no matter in what environment they are run?
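
For example, a relative-tolerance comparison along these lines (a sketch of the idea, not the package's actual test harness):

-- True when a and b agree to within a relative tolerance, which is
-- robust to libm and FMA differences of a few ulps.
approxEq :: Double -> Double -> Bool
approxEq a b = abs (a - b) <= 1e-12 * max 1 (max (abs a) (abs b))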

multiple applications of grad giving incorrect result

I see this problem in multidimensional functions but here is a simple one dimensional case.

This one liner should be [8.0]

grad (sum . map (\x -> x*x) . grad (\[x] -> x**2)) [1]

Instead I get [48.0]. Using diff I get the right answer.

diff ((sum . map (\x -> x*x) . grad (\[x] -> x**2)) . (:[])) 1

Any idea what is going on here?

Stack overflows on some large problems

From Lennart Augustsson,

Hi,

I'm trying to use your AD package for some fairly serious computations.
I'm using grad' to compute the partial derivatives for a function of about 800 variables.
The computation itself take 10-20s.
A problem I frequently experience is that I get a stack overflow.
I think it happens while the partial derivatives are evaluated rather than the regular computation, because using Double rather than AD numbers behaves well.

It's not a serious problem, but maybe you want to take a look.

And no, I can't send you the code. It's computing PV and delta risk for some actual products.

Thanks for a great package!

Missing instances.h

Compiling ad-4.0:

Preprocessing library ad-4.0...

src/Numeric/AD/Internal/Dense.hs:187:0:
fatal error: instances.h: No such file or directory

conjugateGradient used to give the wrong answer, now it gives no answer...

The code below gave the wrong numeric result in 3.4. Now in 4.0 it gives a type error, as shown.

$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> :m + Numeric.AD
Prelude Numeric.AD> conjugateGradientDescent (\[x,y]->x^2+y^2/2) [10,10]

<interactive>:3:42:
    Could not deduce (Fractional t) arising from a use of `/'
    from the context (Numeric.AD.Internal.Classes.Mode t,
                      Double ~ Numeric.AD.Internal.Classes.Scalar t,
                      Num t)
      bound by a type expected by the context:
                 (Numeric.AD.Internal.Classes.Mode t,
                  Double ~ Numeric.AD.Internal.Classes.Scalar t, Num t) =>
                 [t] -> t
      at <interactive>:3:1-52
    Possible fix:
      add (Fractional t) to the context of
        a type expected by the context:
          (Numeric.AD.Internal.Classes.Mode t,
           Double ~ Numeric.AD.Internal.Classes.Scalar t, Num t) =>
          [t] -> t
    In the second argument of `(+)', namely `y ^ 2 / 2'
    In the expression: x ^ 2 + y ^ 2 / 2
    In the first argument of `conjugateGradientDescent', namely
      `(\ [x, y] -> x ^ 2 + y ^ 2 / 2)'

Bug in conjugateGradientDescent

Using vector, you can write the Rosenbrock function as

f xs = (1 - x) ** 2 + 100 * (y - x**2)**2
  where
  x = xs ! 0
  y = xs ! 1

(The global min is at (1,1))

Then

gradientDescent f $ fromList [0,0]
  = [fromList [0.2,0.0]
  , fromList [0.19,5.000000000000002e-2]
  , fromList [0.2067275,3.2624999999999994e-2]
  , fromList [0.21141771788025895,4.526407407031251e-2]
  , ...
  ]

but

conjugateGradientDescent f $ fromList [0,0]
  = [fromList [0.0,0.0],fromList [0.16126202313958898,0.0]]
