
Surrogate modeling and optimization for scientific machine learning (SciML)

Home Page: https://docs.sciml.ai/Surrogates/stable/

License: Other

Julia 100.00%
surrogate-models surrogate-based-optimization surrogates surrogate sciml scientific-machine-learning high-performance-computing automatic-differentiation differential-equations julia

surrogates.jl's Introduction

Surrogates.jl

Badges: Zulip chat (https://julialang.zulipchat.com, #sciml-bridged), global docs, codecov, build status, ColPrac contributor's guide, SciML code style, DOI.

A surrogate model is an approximation method that mimics the behavior of a computationally expensive simulation. In more mathematical terms: suppose we are attempting to optimize a function f(p), but each evaluation of f is very expensive. It may be the case that we need to solve a PDE for each point or use advanced numerical linear algebra machinery, which is usually costly. The idea is then to develop a surrogate model g which approximates f by training on previous data collected from evaluations of f. The construction of a surrogate model can be seen as a three-step process:

  1. Sample selection
  2. Construction of the surrogate model
  3. Surrogate optimization

Sampling can be done through QuasiMonteCarlo.jl; all the functions available there can be used in Surrogates.jl.
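For example, a minimal sampling call over a 2D box looks like this (a sketch; SobolSample is one of the available sampler types):

using Surrogates

lb = [1.0, 2.0]    # lower bounds
ub = [10.0, 8.5]   # upper bounds
x = sample(100, lb, ub, SobolSample())  # 100 quasi-random points in the box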

All of the currently available surrogate models:

  • Kriging
  • Kriging using Stheno
  • Radial Basis
  • Wendland
  • Linear
  • Second Order Polynomial
  • Support Vector Machines (waiting on LIBSVM resolution)
  • Neural Networks
  • Random Forests
  • Lobachevsky
  • Inverse-distance
  • Polynomial expansions
  • Variable fidelity
  • Mixture of experts (waiting for the GaussianMixtures package to work on v1.5)
  • Earth
  • Gradient Enhanced Kriging

All of the currently available optimization methods:

  • SRBF
  • LCBS
  • DYCORS
  • EI
  • SOP
  • Multi-optimization: SMB and RTEA

Installing the Surrogates package

using Pkg
Pkg.add("Surrogates")

surrogates.jl's People

Contributors

00krishna, archermarx, arnostrouwen, ashutosh-b-b, chrisrackauckas, christopher-dg, chronum94, dependabot[bot], dreycenfoiles, fjebaker, github-actions[bot], j-fu, jbrea, jeffreysarnoff, kanav99, ludoro, marcoq, martinuzzifrancesco, michiboo, mortenpi, mrandri19, platawiec, ranjanan, rohitrathore1, sathvikbhagavan, sharanry, st--, thazhemadam, vikram-s-narayan, viralbshah


surrogates.jl's Issues

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Integration with Diffeq

It would be cool to have a function that lets the user specify a DiffEq.jl problem but solve it with Surrogates.

Improving RBF functionalities

As discussed with @PatricioFarrel, here's a list of features that could be added to Radials (some of this is already in #115):

  1. Let the user pick the basis function but automatically choose the "q" parameter required for the polynomial degree

  2. Scaling parameter to avoid numerical instability with few samples

  3. Sparse constructor

  4. Matrix-free constructor

  5. Multilevel methods for solving the system

  6. Possibility of using iterative solvers

Missing dependency?

I get this error message when trying to precompile Surrogates (with Julia 1.5.3):

ERROR: LoadError: LoadError: ArgumentError: Package Surrogates does not have XGBoost in its dependencies:                            
- If you have Surrogates checked out for development and have                                                                        
  added XGBoost as a dependency but haven't updated your primary                                                                     
  environment's manifest file, try `Pkg.resolve()`.                                                                                  
- Otherwise you may need to report an issue with Surrogates

After `] add XGBoost`, it gets past XGBoost (giving a warning version of the error above) and then errors on Flux.

ND Polynomial basis in Radials

At the moment, in Radials.jl we have two different polynomial bases:

  1. In the 1D case we have _scaled_chebyshev, which makes sense and is well behaved.

  2. For the ND (say 2D) case, at the moment I have "centralized_monomial", which however is not a basis, because it only has values of the form (xyz - central point)^k / ((lb - ub)/2) with k varying. It is not important to devise a well-performing basis; for example, the equivalent of (1, x, x^2, ...) could be used, given that at maximum we go up to degree 2 (a sketch follows the reference below).

More info here:
response_surface.pdf
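As a sketch of the suggestion above (assuming d = 2 and degree at most 2), a proper monomial basis evaluated at a point x would simply be:

# all monomials up to degree 2 in two variables: (1, x1, x2, x1^2, x1*x2, x2^2)
monomial_basis_2d(x) = [1.0, x[1], x[2], x[1]^2, x[1] * x[2], x[2]^2]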

Documentation improvements

  • Not every optimizer/surrogate/sampler has a clear example of what parameters have to be passed for it to work.
    Often the box around a code snippet is missing, and there are missing docstrings.

http://surrogates.sciml.ai/dev/optimizations/
http://surrogates.sciml.ai/dev/surrogate/
http://surrogates.sciml.ai/dev/samples/

  • Stheno is mentioned on the front page as a possible surrogate, but the example of how to use it seems to have been removed.
    I like using Stheno.jl (and GaussianProcesses.jl) as my surrogate, as they provide a maximum likelihood method to automatically tune the hyperparameters of the Gaussian process.

  • The description of how to define a new surrogate might be incomplete. Don't some optimizers also require the variance, and not only the expectation?

Representation of `x`

https://github.com/JuliaDiffEq/Surrogates.jl/blob/master/src/Radials.jl#L94 shows some issues. x here is being used row-major, when the fast dimension is the other way around. I think this kind of handling of x gives the package a somewhat confusing flow and causes a difference between 1D and ND when there really shouldn't be one. Instead, x should be an array of values which are the points x1, x2, x3, .... For example, x = [3,4,5] is 1D, while x = [(1,2),(3,4),(5,6)] is a 2D representation; or it can be an array of arrays, or an array of static arrays, or ... it doesn't matter, as long as the interface just assumes an AbstractArray interface on the underlying x. In this way, x[i] is always the ith point in either form, and in the array-of-structs version it will both be contiguous (faster) and easier to add new points (just a push! or append! operation, instead of allocating a new matrix).
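A small sketch of the proposed convention (illustrative, not current package behavior):

x1d = [3.0, 4.0, 5.0]                       # three 1D points
x2d = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]  # three 2D points

x2d[2]                  # the 2nd point, in either representation
push!(x2d, (7.0, 8.0))  # adding a point needs no matrix reallocation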

SOP ND

Now that SOP 1D is complete, it should be quite straightforward to add another method for the ND case, using the other surrogate optimization methods as a guide as well as the following paper:

SOP: Parallel surrogate global optimization with Pareto center selection for computationally expensive single objective problems, by Tipaluck Krityakierne, Taimoor Akhtar, and Christine A. Shoemaker.

Standard API with kwargs

Hi! Thanks for this neat package. I just think it would be nice if all the main surrogate constructors had the same arguments X and y, with default values for the kwargs that may be unique to each method. For example, a dense NN surrogate with 3 layers could be the default for NN surrogates, and the lb and ub arguments needed sometimes could be calculated from X, etc. This would make trying things out much easier for first-time users.
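A hypothetical sketch of what those uniform signatures could look like (the keyword names here are assumptions, not the current API):

surr_rbf = RadialBasis(X, y)                    # lb/ub inferred from X
surr_nn  = NeuralSurrogate(X, y; n_layers = 3)  # dense 3-layer NN by default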

Singular Interpolation Matrices when using Radial Basis Functions

There are two (related) circumstances when building interpolation matrices for radial basis functions where the resulting interpolation matrix is singular:

  1. When there are many samples and q>0. This only occurs for multivariable inputs, so it likely has to do with the implementation of the monomial basis.
using Surrogates
using LinearAlgebra  # for norm

f = x -> x[1]*x[2]
lb = [1.0,2.0]
ub = [10.0,8.5]
x = sample(500,lb,ub,SobolSample())
y = f.(x)
linear = z -> norm(z)
my_radial_basis = RadialBasis(x,y,[lb,ub],linear,0) # this doesn't error
my_radial_basis = RadialBasis(x,y,[lb,ub],linear,1) # this errors
  2. When there are noisy observations (multiple identical x, different y). This makes sense, but I wonder if the method should automatically take the mean of the observations (or whatever is appropriate for the given surrogate); a sketch of that pre-processing follows the snippet below. Maybe more pertinently, the Kriging surrogate also errors for this, and Kriging should be able to handle it.
using Surrogates
using LinearAlgebra  # for norm

f = x -> x[1]*x[2]
lb = [1.0,2.0]
ub = [10.0,8.5]
x = sample(5,lb,ub,SobolSample())
y = f.(x)
push!(x, first(x))
push!(x, first(x))
push!(y, first(y)*1.1)
push!(y, first(y)*0.9)
linear = z -> norm(z)
my_radial_basis = RadialBasis(x,y,[lb,ub],linear,1) # this errors
my_radial_basis = Kriging(x, y, fill(1.0, length(lb)), fill(1.0, length(lb))) # this errors
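A sketch of the mean-of-duplicates pre-processing suggested in point 2, assuming exactly identical x's (this is not current package behavior):

using Statistics

function dedup_mean(x, y)
    groups = Dict{eltype(x), Vector{eltype(y)}}()
    for (xi, yi) in zip(x, y)
        push!(get!(groups, xi, eltype(y)[]), yi)  # group y's by identical x
    end
    xs = collect(keys(groups))
    ys = [mean(groups[xi]) for xi in xs]          # average duplicates
    return xs, ys
end

x_clean, y_clean = dedup_mean(x, y)  # now safe to pass to the surrogate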

Additionally, when setting the q parameter (controlling the order of the polynomial) to 0, the predict function throws an error:

using Surrogates
using LinearAlgebra  # for norm

f = x -> x[1]*x[2]
lb = [1.0,2.0]
ub = [10.0,8.5]
x = sample(50,lb,ub,SobolSample())
y = f.(x)
linear = z -> norm(z)
my_radial_basis = RadialBasis(x,y,[lb,ub],linear,0)
my_radial_basis(lb) # this errors

This is easily fixed by explicitly checking for iszero(q) in (r::RadialBasis).
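A hedged sketch of that guard; _approx_rbf and the polynomial-tail helper are assumptions about the internal implementation, not the actual source:

function (rad::RadialBasis)(val)
    approx = _approx_rbf(val, rad)       # radial part (hypothetical helper)
    iszero(rad.q) && return approx       # skip the polynomial tail when q == 0
    return approx + _approx_polynomial(val, rad)  # hypothetical helper
end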

Proposal: separate surrogate settings from x, y points

Motivation

Although the surrogate type defines the surrogate construction, it cannot act as a generic algorithmic dispatch if the user already has a set of points (x, y).

The proposed design tracks with the DiffEq ecosystem closer, as a bonus.

Current State

Currently, each AbstractSurrogate includes both the surrogate algorithm and the x, y points the surrogate is generated over. The surrogate hyperparameters are therefore mixed in the struct definition with the x, y points, which exist independently. Some surrogates also get passed an lb, ub pair, which isn't necessarily used in the surrogate generation. Furthermore, the constructor for the surrogate does a lot of work in order to return the final type.

Proposal

One way to resolve this is to separate out the fitted surrogate from the surrogate algorithm used.

mutable struct FittedSurrogate{X, Y, S <: AbstractSurrogate, SS}
    x::X
    y::Y
    surrogate::S
    surrogate_state::SS
end

A new fit method can be defined which generates the surrogate based on the passed algorithm. Any internal parameters can be held within the surrogate_state field, whose type would remain unspecified.
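One possible shape for that fit method (illustrative only; _build_surrogate_state is a hypothetical internal builder):

function fit(x, y, alg::AbstractSurrogate)
    state = _build_surrogate_state(x, y, alg)  # alg-specific fitting work
    return FittedSurrogate(x, y, alg, state)
end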

Looking at a more concrete example, take the NeuralSurrogate example from the documentation:

using Surrogates
using Flux
using Statistics

f = x -> x[1]^2 + x[2]^2
bounds = Float32[-1.0, -1.0], Float32[1.0, 1.0]
# Flux models are in single precision by default.
# Thus, single precision will also be used here for our training samples.

x_train = sample(100, bounds..., SobolSample())
y_train = f.(x_train)

# Perceptron with one hidden layer of 20 neurons.
model = Chain(Dense(2, 20, relu), Dense(20, 1))
loss(x, y) = Flux.mse(model(x), y)

# Training of the neural network
learning_rate = 0.1
optimizer = Descent(learning_rate)  # Simple gradient descent. See Flux documentation for other options.
n_epochs = 50
sgt_model = NeuralSurrogate(model=model, loss=loss, opt=optimizer, n_echos=n_epochs)
sgt = fit(x_train, y_train, sgt_model)

# Testing the new model
x_test = sample(30, bounds..., SobolSample())
test_error = mean(abs2, sgt(x)[1] - f(x) for x in x_test)

A linear surrogate would just be

struct LinearSurrogate <: AbstractSurrogate end

my_linear_surr_1D = fit(x, y, LinearSurrogate())

Potential Issues

Here I proposed a design where the surrogate state (= the result of fitting) is stored with the surrogate in the FittedSurrogate type. Another way we could do this is to store the surrogate state in the algorithmic dispatch type. However, we would need to specify the types of the x, y points ahead of time. In this design the surrogate state is separate, kind of like ODEIntegrator, where there is an alg and a separate cache for the alg. As a pathological example, in the case of the NeuralSurrogate, the internal model state would be updated during the fitting process because the model structure (currently) is held as a hyperparameter for the surrogate. I'm not sure how to cleanly solve the issue without a deepcopy of the model into the surrogate state type.

Interested to hear what you think!

Adaptive/Sequential Sampling

For expensive simulations, especially where one is not interested in the optimum of the surrogate, but rather in the general response of the function to input parameters, it may be beneficial to intelligently plan the sampling based on knowledge of the current surrogate model. In some sense, this is similar to what the surrogate_optimize function does. This link is made explicit in the docstring for SRBF, where a parameter tunes the trade-off between exploration and exploitation. However, some algorithms never exploit (and some algorithms never purely explore), so it seems to me that there is a distinction between the "optimization" class of algorithms and the "adaptive sampling/experimental design" set of algorithms.

Is this point of view correct? Should there be an interface distinction between sampling / adaptive sampling / optimization? Or should it remain as now (sampling / [adaptive sampling for certain settings, optimization for most settings])? What would an appropriate interface for adaptive sampling tasks look like?

See this master's thesis for a survey of algorithms: https://arxiv.org/pdf/1905.05345.pdf

Compactly Supported Basis Functions

In general, basis functions produce dense interpolation matrices. However, there is a special class of basis functions, called compactly supported (CS), that produce sparse interpolation matrices. Additionally, these lend themselves well to matrix-free representations. This is a must for large-scale problems. Lobachevsky splines are an example of a CS basis function.

Given a finite L-spline parameter n, the basis function is CS on the d-dimensional box defined by [-sqrt(3n)/alpha_i, sqrt(3n)/alpha_i], i = 1, ..., d (see #51 for dimension-dependent alpha). So, all nodes outside this support will produce 0.0's in the interpolation matrix. In Leonard_dissertation_final.pdf, Andrew was able to solve a 500k-node system exploiting this. A dense representation with Float32 would require 1000 GB for the interpolation matrix. For his case, alphas were selected to produce 99% sparsity. However, even a sparse representation would require 30 GB! So a matrix-free approach was used. This should be simple in Julia with LinearMaps.jl and IterativeSolvers.jl. I have a prototype of this working and I will post it once I clean it up.

Another popular set of CS radial basis functions is Wendland's CSRBFs; good resources are Fasshauer 2007 and 2015.

It may be worth considering a "compact" type for this class of functions.

A benefit of using CS basis functions is that you can apply Partition of Unity (Fasshauer 2007 has a very approachable introduction to this.) "This approach offers a simple way to decompose a large problem into many small problems while at the same time ensuring that the accuracy obtained for the local fits is carried over to the global fit." I have not seen POU implemented in Julia, but there is some discussion here
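A minimal matrix-free sketch along those lines (not the prototype mentioned above; the Wendland kernel, node set, and alpha are illustrative assumptions). A real implementation would exploit the compact support with a neighbor search instead of the naive O(n^2) loop below:

using LinearMaps, IterativeSolvers, LinearAlgebra

nodes = [rand(2) for _ in 1:1000]
y = [sin(p[1]) * cos(p[2]) for p in nodes]

alpha = 4.0
# Wendland C2 kernel, compactly supported on r in [0, 1/alpha]
wendland(r) = (s = alpha * r; s < 1 ? (1 - s)^4 * (4s + 1) : 0.0)

# action of the symmetric interpolation matrix; the matrix is never formed
function apply_interp!(out, v)
    for i in eachindex(nodes)
        acc = 0.0
        for j in eachindex(nodes)
            acc += wendland(norm(nodes[i] - nodes[j])) * v[j]
        end
        out[i] = acc
    end
    return out
end

A = LinearMap{Float64}(apply_interp!, length(nodes); issymmetric = true, ismutating = true)
coeffs = cg(A, y)  # conjugate gradient solve, matrix-free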

Docs: examples for each surrogate

Under tutorials: instead of "Basics" and "Stheno with Kriging", it would be nice to have a tutorial for each surrogate, both 1D and ND, showcasing different features and calling an optimization method.

  • Radials
  • Kriging
  • Lobachevsky
  • Neural
  • Inverse distance
  • Linear
  • Second order poly
  • RandomForestSurrogate
  • Mixture of experts
  • Variable fidelity
  • Polychaos
  • GEK

Fixing docstrings

Docstrings are not up to date with recent changes; it would be nice to have them fixed.

Other AbstractSurrogates

Error when building a surrogate using the Radial Basis surrogate

When I am building a surrogate for

f(x) = log(x) * x^2 + x^3

which is shown in your tutorials, I choose the Radial Basis surrogate:

my_radial_basis = RadialBasis(x,y,lb,ub,thin_plate_spline,q)

This throws a DomainError:

DomainError with -2.25:
log will only return a complex result if called with a complex argument. Try log(Complex(x)).

Stacktrace:
 [1] throw_complex_domainerror(::Symbol, ::Float64) at .\math.jl:33
 [2] log(::Float64) at .\special\log.jl:285
 [3] #5 at .\In[4]:1 [inlined]
 [4] _construct_rbf_interp_matrix(::Array{Float64,1}, ::Float64, ::Float64, ::Float64, ::var"#5#6", ::Int64) at C:\Users\TeAmp0is0N\.julia\packages\Surrogates\Kf76T\src\Radials.jl:59
 [5] _calc_coeffs(::Array{Float64,1}, ::Array{Float64,1}, ::Float64, ::Float64, ::Function, ::Int64) at C:\Users\TeAmp0is0N\.julia\packages\Surrogates\Kf76T\src\Radials.jl:45
 [6] RadialBasis(::Array{Float64,1}, ::Array{Float64,1}, ::Float64, ::Float64, ::Function, ::Int64) at C:\Users\TeAmp0is0N\.julia\packages\Surrogates\Kf76T\src\Radials.jl:25
 [7] top-level scope at In[5]:1

After this, if I call log with a complex argument, it throws an InexactError:

InexactError: Float64(4.105334219595164 + 15.904312808798327im)

Stacktrace:
 [1] Real at .\complex.jl:37 [inlined]
 [2] convert at .\number.jl:7 [inlined]
 [3] setindex! at .\array.jl:828 [inlined]
 [4] _construct_rbf_interp_matrix(::Array{Float64,1}, ::Float64, ::Float64, ::Float64, ::var"#7#8", ::Int64) at C:\Users\TeAmp0is0N\.julia\packages\Surrogates\Kf76T\src\Radials.jl:59
 [5] _calc_coeffs(::Array{Float64,1}, ::Array{Float64,1}, ::Float64, ::Float64, ::Function, ::Int64) at C:\Users\TeAmp0is0N\.julia\packages\Surrogates\Kf76T\src\Radials.jl:45
 [6] RadialBasis(::Array{Float64,1}, ::Array{Float64,1}, ::Float64, ::Float64, ::Function, ::Int64) at C:\Users\TeAmp0is0N\.julia\packages\Surrogates\Kf76T\src\Radials.jl:25
 [7] top-level scope at In[8]:1

You can also view my notebook.

SOP

Hi,

My understanding from the documentation is that SOP can run with several workers on a cluster. Is that correct?

If yes, could you provide an example on how to do so?

Many thanks.

SVMSurrogate assumes input size to be (1,2)

I get a reshape error when evaluating the generated model, because the code assumes the size of the input variables to be (1, 2) in line 35 of SVMSurrogate.jl:


function (svmsurr::SVMSurrogate)(val)
    # reshape hard-codes a (1, 2) shape, so any input that is not a 2-element point fails
    return LIBSVM.predict(svmsurr.model, reshape(collect(val), 1, 2))[1]
end

Surrogates v1.2.0 Does not precompile on v1.5

julia> using Surrogates
[ Info: Precompiling Surrogates [6fc51010-71bc-11e9-0e15-a3fcc6593c49]
┌ Warning: Assignment to `#s262` in soft scope is ambiguous because a global variable by the same name exists: `#s262` will be treated as a new local. Disambiguate by using `local #s262` to suppress this warning or `global #s262` to assign to the existing global variable.
└ @ /data/ranjanan/.julia/packages/Stheno/II2B3/src/util/abstract_data_set.jl:39
[the same soft-scope warning repeats for a dozen more locations in Stheno]
ERROR: LoadError: LoadError: UndefVarError: save not defined
Stacktrace:
 [1] getproperty(::Module, ::Symbol) at ./Base.jl:26
 [2] top-level scope at /data/ranjanan/.julia/packages/GaussianMixtures/3jRIL/src/io.jl:7
 [3] include(::Function, ::Module, ::String) at ./Base.jl:380
 [4] include at ./Base.jl:368 [inlined]
 [5] include(::String) at /data/ranjanan/.julia/packages/GaussianMixtures/3jRIL/src/GaussianMixtures.jl:6
 [6] top-level scope at /data/ranjanan/.julia/packages/GaussianMixtures/3jRIL/src/GaussianMixtures.jl:32
 [7] include(::Function, ::Module, ::String) at ./Base.jl:380
 [8] include(::Module, ::String) at ./Base.jl:368
 [9] top-level scope at none:2
 [10] eval at ./boot.jl:331 [inlined]
 [11] eval(::Expr) at ./client.jl:467
 [12] top-level scope at ./none:3
in expression starting at /data/ranjanan/.julia/packages/GaussianMixtures/3jRIL/src/io.jl:7
in expression starting at /data/ranjanan/.julia/packages/GaussianMixtures/3jRIL/src/GaussianMixtures.jl:32
ERROR: LoadError: LoadError: Failed to precompile GaussianMixtures [cc18c42c-b769-54ff-9e2a-b28141a64aae] to /data/ranjanan/.julia/compiled/v1.5/GaussianMixtures/1kPVN_CKcEC.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1290
 [3] _require(::Base.PkgId) at ./loading.jl:1030
 [4] require(::Base.PkgId) at ./loading.jl:928
 [5] require(::Module, ::Symbol) at ./loading.jl:923
 [6] include(::Function, ::Module, ::String) at ./Base.jl:380
 [7] include at ./Base.jl:368 [inlined]
 [8] include(::String) at /data/ranjanan/.julia/packages/Surrogates/nRS9U/src/Surrogates.jl:1
 [9] top-level scope at /data/ranjanan/.julia/packages/Surrogates/nRS9U/src/Surrogates.jl:19
 [10] include(::Function, ::Module, ::String) at ./Base.jl:380
 [11] include(::Module, ::String) at ./Base.jl:368
 [12] top-level scope at none:2
 [13] eval at ./boot.jl:331 [inlined]
 [14] eval(::Expr) at ./client.jl:467
 [15] top-level scope at ./none:3
in expression starting at /data/ranjanan/.julia/packages/Surrogates/nRS9U/src/MOE.jl:2
in expression starting at /data/ranjanan/.julia/packages/Surrogates/nRS9U/src/Surrogates.jl:19
ERROR: Failed to precompile Surrogates [6fc51010-71bc-11e9-0e15-a3fcc6593c49] to /data/ranjanan/.julia/compiled/v1.5/Surrogates/qZF7j_CKcEC.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1290
 [3] _require(::Base.PkgId) at ./loading.jl:1030
 [4] require(::Base.PkgId) at ./loading.jl:928
 [5] require(::Module, ::Symbol) at ./loading.jl:923

Lobachevsky-Spline: Integrating select dimensions

One of the benefits of Lobachevsky splines is that not only can you compute the integral over the whole domain, but you can also compute the integral over a subset of the dimensions, e.g. to marginalize a pdf.

Example usage: given R^N -> R data,

  1. fit an N-dim L-spline,
  2. integrate dimension i, which updates the coefficients, and
  3. return a new (N-1)-dim L-spline.
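A hedged sketch of that workflow; the constructor and lobachevsky_integrate_dimension names here mirror the package's existing Lobachevsky utilities but should be treated as assumptions:

loba_nd = LobacheskySurrogate(x, y, alpha, n, lb, ub)               # step 1: fit N-dim L-spline
loba_marginal = lobachevsky_integrate_dimension(loba_nd, lb, ub, 2) # steps 2-3: integrate out dim 2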

Integrate Radial and Kriging with samplings

Surrogate Global Optimization Challenge Problem

It's probably a good time to start thinking about getting a challenge problem here. I tried using the surrogate optimization for global parallel optimization and ran into scaling issues with large design matrices, as it was iteratively adding one point at a time. The case I was working on was proprietary, so I cannot share it, but I think it would be good to get an open challenge problem to start tackling here, since I think we're close but not all the way there yet.

@jlperla I think you might have an example?

Pkg.add("Surrogates") is installing v1.1.2 , not the latest v1.2

Hey guys,

I was trying to install the new version 1.2 to fix the XGBoost bug, but Pkg.add("Surrogates") installs the previous version, v1.1.2.

When I try to force a specific version with Pkg.add(Pkg.PackageSpec(;name="Surrogates", version="1.2")), I get the error:

Unsatisfiable requirements detected for package Surrogates [6fc51010]:
Surrogates [6fc51010] log:
├─possible versions are: [0.1.0-0.1.1, 0.2.0, 0.3.0, 0.4.0, 0.5.0, 1.0.0-1.0.1, 1.1.2] or uninstalled
└─restricted to versions 1.2 by an explicit requirement — no versions left

Can you help me with this? Thank you in advance.

Handle infinite/NaN values

MWE of a poorly conditioned objective function:

using Surrogates, Stheno, Distributions
d = Normal()
dat = rand(d, 300)
function test(x)
    x = collect(x)
    μ = x[1]
    logsigma = x[2]
    σ = exp(logsigma)
    d2 = Normal(μ, σ)
    -log(exp(loglikelihood(d2, dat)))
end
#optimize(test, [1., 1.], BFGS(), autodiff = :forward) Works fine with Optim
ig = [1., 1.]
test2(x) = test(collect(x))
l = ig.*0.5
u = ig .* 2.
s = Surrogates.sample(20, l, u, SobolSample())
fs = test2.(s)
ind = isfinite.(fs)
σ² = 0.05
gp = Stheno.GP(σ² * stretch(matern52(), 1.), Stheno.GPC())
my_krig = SthenoKriging(s[ind], fs[ind], gp)
surrogate_optimize(test, LCBS(), l, u, my_krig, SobolSample())

I actually think there is a separate issue here. When I run this example, I get:

ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.
Stacktrace:
 [1] checkpositivedefinite at /home/andrew/Programs/julia-1.3.1-linux-x86_64/julia-1.3.1/share/julia/stdlib/v1.3/LinearAlgebra/src/factorization.jl:18 [inlined]
 [2] #cholesky!#124(::Bool, ::typeof(LinearAlgebra.cholesky!), ::LinearAlgebra.Symmetric{Float64,Array{Float64,2}}, ::Val{false}) at /home/andrew/Programs/julia-1.3.1-linux-x86_64/julia-1.3.1/share/julia/stdlib/v1.3/LinearAlgebra/src/cholesky.jl:226
 [3] #cholesky! at ./none:0 [inlined] (repeats 2 times)
 [4] #cholesky#129 at /home/andrew/Programs/julia-1.3.1-linux-x86_64/julia-1.3.1/share/julia/stdlib/v1.3/LinearAlgebra/src/cholesky.jl:348 [inlined]
 [5] cholesky at /home/andrew/Programs/julia-1.3.1-linux-x86_64/julia-1.3.1/share/julia/stdlib/v1.3/LinearAlgebra/src/cholesky.jl:348 [inlined] (repeats 2 times)
 [6] |(::GP{Stheno.ZeroMean{Float64},Stheno.EQ}, ::Stheno.Observation{Stheno.FiniteGP{Stheno.CompositeGP{Tuple{typeof(LinearAlgebra.cross),Array{GP{Stheno.ZeroMean{Float64},Stheno.EQ},1}}},Stheno.BlockData{Array{Float64,1},ColVecs{Float64,Array{Float64,2}}},BlockArrays.BlockArray{Float64,2,LinearAlgebra.Diagonal{LinearAlgebra.Diagonal{Float64,Array{Float64,1}},Array{LinearAlgebra.Diagonal{Float64,Array{Float64,1}},1}},BlockArrays.BlockSizes{2,Tuple{Array{Int64,1},Array{Int64,1}}}}},Array{Float64,1}}) at /home/andrew/.julia/packages/Stheno/j89z0/src/composite/conditioning.jl:9
 [7] map at /home/andrew/.julia/packages/Stheno/j89z0/src/composite/conditioning.jl:69 [inlined]
 [8] |(::Tuple{GP{Stheno.ZeroMean{Float64},Stheno.EQ}}, ::Tuple{Stheno.Observation{Stheno.FiniteGP{GP{Stheno.ZeroMean{Float64},Stheno.EQ},ColVecs{Float64,Array{Float64,2}},LinearAlgebra.Diagonal{Float64,Array{Float64,1}}},Array{Float64,1}}}) at /home/andrew/.julia/packages/Stheno/j89z0/src/composite/conditioning.jl:71
 [9] _condition_gps(::ColVecs{Float64,Array{Float64,2}}, ::Array{Array{Float64,1},1}, ::Tuple{GP{Stheno.ZeroMean{Float64},Stheno.EQ}}, ::Float64) at /home/andrew/.julia/packages/Surrogates/dfpEW/src/SthenoKriging.jl:95
 [10] _prepare_gps at /home/andrew/.julia/packages/Surrogates/dfpEW/src/SthenoKriging.jl:82 [inlined]
 [11] add_point!(::SthenoKriging{Array{Tuple{Float64,Float64},1},Array{Float64,1},Tuple{GP{Stheno.ZeroMean{Float64},Stheno.EQ}},Float64,Tuple{Stheno.CompositeGP{Tuple{typeof(|),GP{Stheno.ZeroMean{Float64},Stheno.EQ},LinearAlgebra.Cholesky{Float64,Array{Float64,2}},Array{Float64,1},Stheno.FiniteGP{Stheno.CompositeGP{Tuple{typeof(LinearAlgebra.cross),Array{GP{Stheno.ZeroMean{Float64},Stheno.EQ},1}}},Stheno.BlockData{Array{Float64,1},ColVecs{Float64,Array{Float64,2}}},BlockArrays.BlockArray{Float64,2,LinearAlgebra.Diagonal{LinearAlgebra.Diagonal{Float64,Array{Float64,1}},Array{LinearAlgebra.Diagonal{Float64,Array{Float64,1}},1}},BlockArrays.BlockSizes{2,Tuple{Array{Int64,1},Array{Int64,1}}}}},Array{Float64,1}}}}}, ::Tuple{Float64,Float64}, ::Float64) at /home/andrew/.julia/packages/Surrogates/dfpEW/src/SthenoKriging.jl:75

I get the same error even before the optimization step if I construct the Sobol grid with 100 points instead of 20. I'm not really sure why. It doesn't seem to have anything to do with the objective. I got the same thing when I used random function values.

Anyway, regardless, this objective function will be infinite valued at some points due to poor numerical conditioning. What should Surrogates do in that case? Decline to add that value and try something else?

On my actual problem I didn't get this positive definite issue. Instead, it ran the optimization step for about 10 points, then said

Out of sampling points

Naive Multi output surrogates

Following the closed PR #106, all that remains is to make the remaining surrogates multi-output. With the exception of Kriging and Lobachevsky, in the rest every output is fit separately.

Lobachevsky-Spline: Dimension dependent alpha

The alpha parameter scales the support of the L-spline basis function. The current implementation takes a scalar value for alpha and applies it to all dimensions; however, you can have separate alphas for each dimension. Consider adding a method that accepts an array of alphas.

This is important if your support (ub - lb) has different scales.
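A hypothetical constructor accepting per-dimension alphas (the current API takes a scalar; the signature here is an assumption):

alphas = [2.0, 0.5]  # tighter support along dimension 1
loba = LobacheskySurrogate(x, y, alphas, n, lb, ub)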

Multidimensional Output Support

Hi, enjoying the package very much.

It's clear that the initial algorithms are focusing on single-output surrogate models, i.e. f: R^n -> R. There is a broad class of problems which would benefit from multi-output surrogate models, i.e. f: R^n -> R^k.

It seems like you've given this some thought, but I was wondering what your plan was regarding these, and if you need support in any areas. On the low-hanging-fruit side, it could be as simple as allowing a function to be passed that returns a vector, and then re-running the surrogate individually for each output (this seems to be how multi-output RBF is done in the literature); a sketch of this is below. On the more advanced side, Kriging/Bayesian methods or neural nets can exploit any structure between the outputs.
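A sketch of that low-hanging-fruit approach, fitting one scalar surrogate per output component (illustrative only; the RadialBasis signature follows other snippets on this page):

using Surrogates
using LinearAlgebra  # for norm

f = x -> (sin(x), cos(x))                 # f: R -> R^2
x = sample(30, 0.0, 6.28, SobolSample())
ys = f.(x)
linear = z -> norm(z)
# one scalar surrogate per output component
surrs = [RadialBasis(x, getindex.(ys, k), 0.0, 6.28, linear, 1) for k in 1:2]
predict(val) = Tuple(s(val) for s in surrs)  # vector-valued prediction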

My reading of the literature suggests:

In the future, the question of how to do the surrogate optimization will need to be tackled, but I don't see it as a major blocker at the moment. Thoughts?

Load and Argument Error

┌ Info: Precompiling Surrogates [6fc51010-71bc-11e9-0e15-a3fcc6593c49]
└ @ Base loading.jl:1260
ERROR: LoadError: LoadError: ArgumentError: Package Surrogates does not have XGBoost in its dependencies:
- If you have Surrogates checked out for development and have
  added XGBoost as a dependency but haven't updated your primary
  environment's manifest file, try `Pkg.resolve()`.
- Otherwise you may need to report an issue with Surrogates
Stacktrace:
 [1] require(::Module, ::Symbol) at .\loading.jl:905
 [2] include(::Module, ::String) at .\Base.jl:377
 [3] include(::String) at C:\Users\TeAmp0is0N\.julia\packages\Surrogates\Kf76T\src\Surrogates.jl:1
 [4] top-level scope at C:\Users\TeAmp0is0N\.julia\packages\Surrogates\Kf76T\src\Surrogates.jl:20
 [5] include(::Module, ::String) at .\Base.jl:377
 [6] top-level scope at none:2
 [7] eval at .\boot.jl:331 [inlined]
 [8] eval(::Expr) at .\client.jl:449
 [9] top-level scope at .\none:3
in expression starting at C:\Users\TeAmp0is0N\.julia\packages\Surrogates\Kf76T\src\RandomForestSurrogate.jl:1
in expression starting at C:\Users\TeAmp0is0N\.julia\packages\Surrogates\Kf76T\src\Surrogates.jl:20
