It would be nice to be able to fix the random samples once and for all.
One solution would be to store the RNG within the Perturbed object; see https://docs.julialang.org/en/v1/stdlib/Random/
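A minimal sketch of what this could look like, with hypothetical field names (`rng`, `seed`) and a simplified forward pass:

```julia
using Random

# Hypothetical sketch: store the RNG and a seed inside the layer, and reseed
# before each forward pass so the perturbation samples are reproducible.
struct Perturbed{F,R<:AbstractRNG}
    maximizer::F
    rng::R
    seed::Int
end

function (p::Perturbed)(θ::AbstractArray; ε=1.0, nb_samples=10)
    Random.seed!(p.rng, p.seed)  # fix the random samples once and for all
    ys = [p.maximizer(θ .+ ε .* randn(p.rng, size(θ))) for _ in 1:nb_samples]
    return sum(ys) / nb_samples
end
```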
We should be able to provide keyword arguments as input to SPOPlusLoss, in a similar way as for the FenchelYoungLoss. This also needs to be implemented for SSVMLoss.
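A rough sketch of the idea, assuming a maximization convention for the SPO+ loss (the struct below is a stripped-down stand-in, not the actual InferOpt code):

```julia
using LinearAlgebra: dot

# Hypothetical sketch: forward keyword arguments to the inner maximizer, as
# FenchelYoungLoss does; the same pattern would apply to SSVMLoss.
struct SPOPlusLoss{F}
    maximizer::F
end

function (l::SPOPlusLoss)(θ, θ_true, y_true; kwargs...)
    y = l.maximizer(2 .* θ .- θ_true; kwargs...)  # kwargs reach the solver
    return dot(2 .* θ .- θ_true, y) - 2 * dot(θ, y_true) + dot(θ_true, y_true)
end
```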
Hello, I have a task to create a graph from the output of a convolutional neural network; that graph will then be used in a graph neural network.
I have a cost function that can tell whether the graph was well constructed.
My main problem is that a graph is by nature discrete, so constructing the graph creation algorithm in a way that enables backpropagation is problematic (as you can see in my architecture, gradients need to pass from the graph neural network back to the CNN).
Would it be possible to use your package for differentiable graph creation from a 3D array?
As of today, Enzyme.jl allows custom rules, written in a different format than ChainRules.jl. It would be nice to test InferOpt.jl with this new autodiff backend
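A possible smoke test for such a backend, using a smooth placeholder layer instead of a real InferOpt layer (all names here are assumptions):

```julia
using Enzyme

# Placeholder smooth layer standing in for an InferOpt perturbed layer.
softplus_layer(θ) = log1p.(exp.(θ))
loss(θ) = sum(softplus_layer(θ))

θ = randn(5)
dθ = zero(θ)
# Reverse-mode differentiation with Enzyme; dθ accumulates the gradient.
Enzyme.autodiff(Reverse, loss, Active, Duplicated(θ, dθ))
@assert dθ ≈ 1 ./ (1 .+ exp.(-θ))  # d/dθ softplus(θ) = sigmoid(θ)
```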
Currently, most of our code deals with values $y$ or $\theta$ that are typed as AbstractArray{<:Real}. This restriction is never used for dispatch, so it is not essential, and it may even prevent the use of other interesting number formats that do not subtype Real. I suggest we remove it.
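To illustrate with a hypothetical function, the change would simply drop the element-type constraint:

```julia
# Before: excludes element types that are Numbers but not Reals
# (e.g. Complex, or unitful quantities).
objective(θ::AbstractArray{<:Real}, y) = sum(θ .* y)

# After: no constraint; unsupported element types still fail
# naturally with a MethodError further down the call chain.
objective(θ::AbstractArray, y) = sum(θ .* y)
```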
It would be nice to be compatible with Julia's Long Term Support (LTS) version 1.6.
I think the main obstacle at the moment is the use of the property destructuring syntax (; a, b) = x, which was introduced in Julia 1.7. It shouldn't be hard to fix, and the test CI needs to be updated too.
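The 1.6-compatible rewrite is mechanical; for example (struct and field names are made up):

```julia
struct Example
    maximizer::Function
    nb_samples::Int
end
x = Example(identity, 10)

# Julia ≥ 1.7 only: property destructuring
# (; maximizer, nb_samples) = x

# Julia 1.6-compatible equivalent:
maximizer, nb_samples = x.maximizer, x.nb_samples
```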
It would be interesting to be able to run perturbed maximizers in parallel, especially when nb_samples is high and/or the combinatorial algorithm has a long runtime.
One option would be to use ThreadsX.jl. For this, we need to wait until the following PR is merged: tkf/ThreadsX.jl#195, in order to avoid a version conflict with Setfield.jl.
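A sketch of what the parallel sampling loop could look like, with placeholder names rather than actual InferOpt internals:

```julia
using ThreadsX

# Hypothetical sketch: evaluate the expensive maximizer on every perturbed
# sample in parallel with ThreadsX.map.
maximizer(θ) = float.(θ .== maximum(θ))  # toy combinatorial maximizer
θ, ε, nb_samples = randn(10), 0.1, 100
Zs = [randn(size(θ)) for _ in 1:nb_samples]
ys = ThreadsX.map(Z -> maximizer(θ .+ ε .* Z), Zs)
y_mean = sum(ys) / nb_samples  # Monte-Carlo estimate of the perturbed maximizer
```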
Reverse rules for PerturbedAdditive and PerturbedMultiplicative take a sum instead of a weighted mean over the Monte-Carlo samples. As a result, we have been computing $M \times J$ instead of $J$ this whole time, where $M$ is the number of samples.
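In other words, the Monte-Carlo vector-Jacobian product should be averaged rather than summed; an illustrative sketch:

```julia
# Illustrative fix: average the per-sample vector-Jacobian products over the
# M = nb_samples Monte-Carlo samples instead of summing them.
M = 100
vjp_samples = [randn(5) for _ in 1:M]
vjp = sum(vjp_samples) / M  # was: sum(vjp_samples), which is M times too large
```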
At the moment, DifferentiableFrankWolfe (see the giom branch) only works when $\theta$ is a vector, not a higher-dimensional array.
It should be easy to implement flattening / unflattening to generalize it.
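A possible sketch of such a wrapper, assuming the solver's output has the same shape as its input:

```julia
# Hypothetical wrapper: flatten a higher-dimensional θ into a vector, call the
# vector-only solver, then restore the original array shape.
function flat_maximizer(maximizer, θ::AbstractArray)
    y_flat = maximizer(vec(θ))       # the solver only sees a plain vector
    return reshape(y_flat, size(θ))  # unflatten, assuming y has the shape of θ
end
```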
Currently, InferOpt fully supports predictors of the form $\arg\max_y \theta^\top y$ in combinatorial layers.
It would be interesting to allow the more general form $$\arg\max_y \theta^\top g(y) + h(y)$$
For the moment, a workaround is to make your predictor return $g(y)$ instead of $y$. This way, gradient computations are correct, but loss value computations are not when $h \neq 0$ (there is a missing term): training will still work, but the reported loss metric will be slightly off.
One implementation option would be to enforce returning the objective value as an additional output of the predictor.
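A hedged sketch of that option, with hypothetical names for the solver and the functions $g$ and $h$:

```julia
using LinearAlgebra: dot

# Hypothetical sketch: the predictor returns both the embedding g(y) and the
# full objective value, so that losses can include the h(y) term.
function predictor_with_objective(θ; g, h, solver)
    y = solver(θ)                     # assumed combinatorial solver
    return g(y), dot(θ, g(y)) + h(y)  # solution embedding and objective value
end
```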
I think traits are a hassle for users who want a quick and easy implementation. If the required methods do not exist for their structures, Julia will throw a MethodError anyway. I suggest we get rid of SimpleTraits and just use unconstrained type parameters for regularization and imitation losses.
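To illustrate the proposal with a hypothetical loss struct:

```julia
# Illustration: an unconstrained type parameter instead of a SimpleTraits
# constraint (struct and field names are hypothetical).
struct ImitationLoss{P}
    predictor::P  # any type; no trait bound
end

(l::ImitationLoss)(θ, y_true) = sum(abs2, l.predictor(θ) .- y_true)
# If `predictor` does not implement the required call, Julia throws a
# MethodError at the call site anyway.
```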
Now that I have figured out how to use JET.report_package properly, it reports a few errors for interface functions that are not implemented. For now the test is skipped, but we need to reactivate it.
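For reference, reactivating it could be as simple as using JET's test helper, which fails the enclosing testset whenever the package analysis finds an issue:

```julia
using JET

# Runs JET's static analysis on the whole package as part of the test suite;
# currently this would flag the unimplemented interface functions.
JET.test_package("InferOpt")
```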