Comments (44)
The seed is set before the "inversion" loop, so each inversion is using a different set of random numbers. It's easy to check that since each run is returning a different optimal solution.
from evodcinv.
Hi @79seismo,
Please update evodcinv
to v2.2.1, that should prevent this issue from happening hopefully.
from evodcinv.
The first reason is API limitation: most optimization libraries (e.g., scipy
), if not all, simply only allow to calculate and store the misfit value for a model. Although I am the developer of stochopy
, I would need to make some substantial changes in the API to allow to store more than that.
The second reason is that evodcinv
only calculates the dispersion velocities at discrete input periods. When plotting results, we usually want to plot the "full" dispersion curves (i.e., at periods different that of the data points), so storing the dispersion curves calculated during the inversion doesn't look as useful given that constraint.
Anyway, after some benchmarks, the performance issues of the plotting routines are not due to the recalculation of the dispersion curves (it should only take a few seconds, especially after thresholding), there seem to be some issues with sequential plotting of thousands of lines with matplotlib
.
from evodcinv.
Hi @79seismo,
I am not sure to correctly understand your question. By default, when maxrun
is greater than 1, the result output contains all models visited by all runs. Thresholding the results returns an ensemble of models from all runs that satisfy the thresholding criterion.
from evodcinv.
Hi @79seismo,
All models and misfits are output by default (attributes models
and misfits
of evodcinv.InversionResult
).
Please check the script .github/sample.py
to see how the figures in the README have been generated.
Without any prior information, it would be good to set rather unconstraining velocity bounds (yet still realistic, i.e., we do not expect 5 km/s at the surface) and constant thicknesses for each layer. Also, set a large population size for the optimizer (e.g., 10 times the number of layers).
Vp is calculated from Vs and Poisson's ratio. By default, density is calculated using Nafe-Drake's equation, but you can set your own model (density
option in method configure
).
Not sure what you mean by "benchmark", but dispersion curves calculated by disba
are consistent with the ones from CPS for the velocity models I have tested. stochopy
(the optimization library used here) has been thoroughly tested during my PhD. Since evodcinv
is mainly relying on these two libraries, the forward modeling and the optimization algorithms are assumed to be working, thus the inversion code here should work properly. If it means anything, the example in the README is actually a sample problem from Geopsy's Wiki.
from evodcinv.
Thanks very much Keurfon. This is useful information. By benchmark I meant comparing results with an established code, which is what you may have done with CPS.
I tried inverting for the model with 10 layers in your example in disba, and although the misfit gets minimised well, the model fit isn't that great for CPSO and is a bit better with NA from what I can tell. Any tips on how I can improve the fit? I set the search ranges as follows for all the layers model.add(Layer([1, 20], [1, 5.0]))
Can I get a copy of your PhD thesis as well please? Thanks!
from evodcinv.
Please note that thickness is defined in km and velocity in km/s.
For the sample problem, the model layers can be configured as follows (assuming 10 layers of constant thickness h=5m down to 50 m depth):
for _ in range(10):
model.add(Layer([0.005, 0.005], [0.1, 3.0]))
Set the population size to 100 and increase the number of iterations to 1000-2000 to be sure to reach convergence. You may also try different optimizers as you did with NA. CMA-ES seems to work quite well here:
model.configure(
optimizer="cmaes",
misfit="rmse",
density=lambda vp: 2.0,
optimizer_args={
"popsize": 100,
"maxiter": 2000,
"workers": -1,
"seed": 0,
},
)
You can also increase the number of runs (e.g., to 3) to start from different initial populations.
res = model.invert(curves, maxrun=3, split_results=True)
res = min(res, key=lambda x: x.misfit) # Get the results for the run with the lowest misfit
Note that you likely will never get the exact true model due to the non-uniqueness of the solution.
--------------------------------------------------------------------------------
Best model out of 200000 models (1 run)
Velocity model Model parameters
---------------------------------------- ------------------------------
d vp vs rho d vs nu
[km] [km/s] [km/s] [g/cm3] [km] [km/s] [-]
---------------------------------------- ------------------------------
0.0050 0.3372 0.2065 2.0000 0.0050 0.2065 0.2000
0.0050 0.3506 0.2147 2.0000 0.0050 0.2147 0.2000
0.0050 0.5330 0.2263 2.0000 0.0050 0.2263 0.3901
0.0050 0.5984 0.2443 2.0000 0.0050 0.2443 0.4000
0.0050 0.2664 0.1632 2.0000 0.0050 0.1632 0.2000
0.0050 1.3206 0.5391 2.0000 0.0050 0.5391 0.4000
0.0050 7.3476 3.0000 2.0000 0.0050 3.0000 0.4000
0.0050 5.8546 2.3907 2.0000 0.0050 2.3907 0.3999
0.0050 0.3920 0.1600 2.0000 0.0050 0.1600 0.4000
1.0000 2.0985 0.8567 2.0000 - 0.8567 0.4000
---------------------------------------- ------------------------------
Number of layers: 10
Number of parameters: 29
Best model misfit: 0.0006
--------------------------------------------------------------------------------
Note that CMA-ES may not always work well. CMA-ES can be compared to a gradient based optimizer as it tries to generate more samples along the direction of the steepest descent.
from evodcinv.
Thanks! much appreciated. Encountered the following for a VDCMA run, possible bug?
I was using population size=100 and iternations = 1000
Also, is there a way to isolate an ensemble of models within a certain threshold of the minimum misfit, say within 20% of the minimum misfit?
What does this do?
res = res.threshold(0.1)
Also, rather than showing all models with 'show=all' as given below, is there a way to isolate and show the models that were extracted within the earlier-mentioned threshold; i.e. say within 20% of the minimum misfit?
res.plot_model( "vs", zmax=zmax, show="all", ax=ax[0], plot_args={"cmap": cmap},
Thank you!
from evodcinv.
I guess VDCMA does not like when all models have infinite misfit (happens when constraints are not satisfied). I will need to look into that.
The method threshold
does exactly what you want: it extracts models that have misfits lower than the input value. If you want the threshold to be 20% of the minimum misfit:
value = 1.2 * res.misfit
Note that:
res = res.threshold(value)
overwrites the result variable res
with the result filtered by the method threshold
. You may want to save the filtered results in another variable, e.g.:
res2 = res.threshold(value)
Then, invoking the method plot_model
from the filtered result will only show the models that satisfied your threshold criterion. Alternatively, you can pipe the methods as follows:
res.threshold(value).plot_models("vs", zmax=zmax, show="all", ax=ax[0], plot_args={"cmap": cmap)
from evodcinv.
Neat! Thanks very much for your help!
from evodcinv.
I still haven't been able to get a good inversion result for a synthetic model I'm trying and it looks like the following - a simple model with a velocity gradient. I tried all algorithms with population size = 100 to 150 and iterations 500 to 2500. Target computational costs don't let me go above these parameters. Any suggestions to improve the inversion? Also, the plotting routines are quite computationally intensive. I wonder if it can be made more efficient. Thanks!
velocity_model = np.array([ [ 0.001 , 0.346 , 0.2 , 1.4099 ], [ 0.001 , 0.4786 , 0.2767 , 1.4116 ], [ 0.001 , 0.6113 , 0.3533 , 1.4132 ], [ 0.001 , 0.7439 , 0.43 , 1.4145 ], [ 0.001 , 0.8765 , 0.5067 , 1.4158 ], [ 0.001 , 1.0092 , 0.5833 , 1.4169 ], [ 0.001 , 1.1418 , 0.66 , 1.418 ], [ 0.001 , 1.2744 , 0.7367 , 1.419 ], [ 0.001 , 1.4071 , 0.8133 , 1.42 ], [ 0.001 , 1.5397 , 0.89 , 1.4209 ], [ 0.001 , 1.6723 , 0.9667 , 1.4218 ], [ 0.001 , 1.805 , 1.0433 , 1.4226 ], [ 0.001 , 1.9376 , 1.12 , 1.4234 ], [ 0.001 , 2.0702 , 1.1967 , 1.4242 ], [ 0.001 , 2.2029 , 1.2733 , 1.425 ], [ 0.001 , 2.3355 , 1.35 , 1.4257 ], [ 0.001 , 2.4681 , 1.4267 , 1.4264 ], [ 0.001 , 2.6008 , 1.5033 , 1.4271 ], [ 0.001 , 2.7334 , 1.58 , 1.4278 ], [ 0.001 , 2.866 , 1.6567 , 1.4285 ], [ 0.001 , 2.9987 , 1.7333 , 1.4291 ], [ 0.001 , 3.1313 , 1.81 , 1.4298 ], [ 0.001 , 3.2639 , 1.8867 , 1.4304 ], [ 0.001 , 3.3966 , 1.9633 , 1.431 ], [ 0.001 , 3.5292 , 2.04 , 1.4316 ], [ 0.001 , 3.6618 , 2.1167 , 1.4322 ], [ 0.001 , 3.7945 , 2.1933 , 1.4328 ], [ 0.001 , 3.9271 , 2.27 , 1.4334 ], [ 0.001 , 4.0597 , 2.3467 , 1.4339 ], [ 0.001 , 4.1924 , 2.4233 , 1.4345 ], [ 0.001 , 4.325 , 2.5 , 1.435 ], ])
Fundamental mode Rayleigh Group dispersion curve from disba for this model:
Period:
array([ 0.1 , 0.10476158, 0.10974988, 0.1149757 , 0.12045035, 0.12618569, 0.13219411, 0.13848864, 0.14508288, 0.15199111, 0.15922828, 0.16681005, 0.17475284, 0.18307383, 0.19179103, 0.2009233 , 0.21049041, 0.22051307, 0.23101297, 0.24201283, 0.25353645, 0.26560878, 0.27825594, 0.29150531, 0.30538555, 0.31992671, 0.33516027, 0.35111917, 0.36783798, 0.38535286, 0.40370173, 0.42292429, 0.44306215, 0.46415888, 0.48626016, 0.5094138 , 0.53366992, 0.55908102, 0.58570208, 0.61359073, 0.64280731, 0.67341507, 0.70548023, 0.7390722 , 0.77426368, 0.81113083, 0.84975344, 0.89021509, 0.93260335, 0.97700996, 1.02353102, 1.07226722, 1.12332403, 1.17681195, 1.23284674, 1.29154967, 1.35304777, 1.41747416, 1.48496826, 1.55567614, 1.62975083, 1.70735265, 1.78864953, 1.87381742, 1.96304065, 2.05651231, 2.15443469, 2.25701972, 2.36448941, 2.47707636, 2.59502421, 2.71858824, 2.84803587, 2.98364724, 3.12571585, 3.27454916, 3.43046929, 3.59381366, 3.76493581, 3.94420606, 4.1320124 , 4.32876128, 4.53487851, 4.75081016, 4.97702356, 5.21400829, 5.46227722, 5.72236766, 5.9948425 , 6.28029144, 6.57933225, 6.8926121 , 7.22080902, 7.56463328, 7.92482898, 8.30217568, 8.69749003, 9.11162756, 9.54548457, 10. ])
Velocity:
array([1.09786535, 1.20770009, 1.30751688, 1.39670049, 1.47560087, 1.54520622, 1.60664691, 1.66103997, 1.70934501, 1.7524971 , 1.79118683, 1.82605584, 1.85757589, 1.88622391, 1.91242616, 1.93634936, 1.95832355, 1.97861568, 1.99732756, 2.01468714, 2.03079722, 2.04576357, 2.05973416, 2.07277005, 2.08493231, 2.09632235, 2.10704698, 2.11707513, 2.12650911, 2.13536061, 2.14373248, 2.15163577, 2.15908177, 2.1660324 , 2.17268075, 2.1789452 , 2.18487845, 2.19044092, 2.19573366, 2.20076534, 2.20553736, 2.21001281, 2.21428938, 2.21832549, 2.2221736 , 2.22583808, 2.22927476, 2.23253889, 2.23567875, 2.23865024, 2.24150667, 2.24420385, 2.24669491, 2.24912634, 2.25145642, 2.2536359 , 2.25576521, 2.25775087, 2.25964183, 2.26144286, 2.26315383, 2.26477708, 2.26631494, 2.26776975, 2.26923784, 2.27057825, 2.27183783, 2.27302139, 2.2742255 , 2.27530642, 2.27636073, 2.27734372, 2.2783025 , 2.27923952, 2.28015721, 2.28090897, 2.28173825, 2.28250089, 2.28324658, 2.28387826, 2.28454512, 2.285195 , 2.28573319, 2.28635639, 2.28686785, 2.28741698, 2.28785676, 2.2883816 , 2.28879949, 2.28920765, 2.2896535 , 2.2900422 , 2.29042116, 2.29074295, 2.29110732, 2.2914145 , 2.29171193, 2.2920045 , 2.2922922 , 2.29257016])
from evodcinv.
I inverted your group velocities with 20 layers:
import numpy as np
from evodcinv import EarthModel, Layer, Curve
import matplotlib.pyplot as plt
period = np.array([ 0.1 , 0.10476158, 0.10974988, 0.1149757 , 0.12045035, 0.12618569, 0.13219411, 0.13848864, 0.14508288, 0.15199111, 0.15922828, 0.16681005, 0.17475284, 0.18307383, 0.19179103, 0.2009233 , 0.21049041, 0.22051307, 0.23101297, 0.24201283, 0.25353645, 0.26560878, 0.27825594, 0.29150531, 0.30538555, 0.31992671, 0.33516027, 0.35111917, 0.36783798, 0.38535286, 0.40370173, 0.42292429, 0.44306215, 0.46415888, 0.48626016, 0.5094138 , 0.53366992, 0.55908102, 0.58570208, 0.61359073, 0.64280731, 0.67341507, 0.70548023, 0.7390722 , 0.77426368, 0.81113083, 0.84975344, 0.89021509, 0.93260335, 0.97700996, 1.02353102, 1.07226722, 1.12332403, 1.17681195, 1.23284674, 1.29154967, 1.35304777, 1.41747416, 1.48496826, 1.55567614, 1.62975083, 1.70735265, 1.78864953, 1.87381742, 1.96304065, 2.05651231, 2.15443469, 2.25701972, 2.36448941, 2.47707636, 2.59502421, 2.71858824, 2.84803587, 2.98364724, 3.12571585, 3.27454916, 3.43046929, 3.59381366, 3.76493581, 3.94420606, 4.1320124 , 4.32876128, 4.53487851, 4.75081016, 4.97702356, 5.21400829, 5.46227722, 5.72236766, 5.9948425 , 6.28029144, 6.57933225, 6.8926121 , 7.22080902, 7.56463328, 7.92482898, 8.30217568, 8.69749003, 9.11162756, 9.54548457, 10. ])
group = np.array([1.09786535, 1.20770009, 1.30751688, 1.39670049, 1.47560087, 1.54520622, 1.60664691, 1.66103997, 1.70934501, 1.7524971 , 1.79118683, 1.82605584, 1.85757589, 1.88622391, 1.91242616, 1.93634936, 1.95832355, 1.97861568, 1.99732756, 2.01468714, 2.03079722, 2.04576357, 2.05973416, 2.07277005, 2.08493231, 2.09632235, 2.10704698, 2.11707513, 2.12650911, 2.13536061, 2.14373248, 2.15163577, 2.15908177, 2.1660324 , 2.17268075, 2.1789452 , 2.18487845, 2.19044092, 2.19573366, 2.20076534, 2.20553736, 2.21001281, 2.21428938, 2.21832549, 2.2221736 , 2.22583808, 2.22927476, 2.23253889, 2.23567875, 2.23865024, 2.24150667, 2.24420385, 2.24669491, 2.24912634, 2.25145642, 2.2536359 , 2.25576521, 2.25775087, 2.25964183, 2.26144286, 2.26315383, 2.26477708, 2.26631494, 2.26776975, 2.26923784, 2.27057825, 2.27183783, 2.27302139, 2.2742255 , 2.27530642, 2.27636073, 2.27734372, 2.2783025 , 2.27923952, 2.28015721, 2.28090897, 2.28173825, 2.28250089, 2.28324658, 2.28387826, 2.28454512, 2.285195 , 2.28573319, 2.28635639, 2.28686785, 2.28741698, 2.28785676, 2.2883816 , 2.28879949, 2.28920765, 2.2896535 , 2.2900422 , 2.29042116, 2.29074295, 2.29110732, 2.2914145 , 2.29171193, 2.2920045 , 2.2922922 , 2.29257016])
model = EarthModel()
for _ in range(20):
model.add(Layer(0.0015, [0.1, 3.0]))
model.configure(
optimizer="cpso",
misfit="rmse",
density=lambda vp: 2.0,
optimizer_args={
"popsize": 20,
"maxiter": 200,
"workers": -1,
"seed": 0,
},
)
curves = [Curve(period, group, 0, "rayleigh", "group")]
res = model.invert(curves, maxrun=3)
res.plot_curve(period, 0, "rayleigh", "group", show="best")
plt.semilogx(period, group, ls="--")
from evodcinv.
Yes, the dispersion curve gets recovered well but not the velocity profile? I was doing this to kind of validate the inversions.
from evodcinv.
The solutions of inversion problems are, in general, non-unique (there is an infinite number of solutions that can match your data). We can't expect the code to find the exact solution, even for synthetic cases. To get "better" results, we need to provide prior information (via better boundary constraints for instance, or additional regularization terms). Without any prior information, the code, as "dumb" as it is, returns one model that fits your data.
from evodcinv.
Indeed, and is there a way to modify the misfit function? Alternatively, it's easier to quantify uncertainty if I can retrieve all the models within a certain threshold of the misfit. Is it possible to retrieve models after res.threshold()?
from evodcinv.
You can look at the option "extra_terms" in "configure", this option allows to add additional misfit terms (e.g., to penalize models).
res.threshold
simply filters out all the models that do not satisfy your threshold criterion. All the acceptable models are then accessible in res.models
.
from evodcinv.
Brilliant, thank you!
from evodcinv.
I've updated the code and improved the option increasing_velocity
(as usual, pip install evodcinv -U
). The models are now initialized following this paper. Interestingly, this options seems to work quite well with Differential Evolution for the three sample problems I have tested (again, that may not be true for all problems).
model = EarthModel()
for _ in range(20):
model.add(Layer(0.0015, [0.1, 3.0]))
model.configure(
optimizer="de",
misfit="rmse",
density=lambda vp: 2.0,
optimizer_args={
"popsize": 50,
"maxiter": 500,
"workers": -1,
"seed": 0,
"strategy": "rand1bin",
},
increasing_velocity=True,
)
curves = [Curve(period, group, 0, "rayleigh", "group")]
res = model.invert(curves, maxrun=3)
--------------------------------------------------------------------------------
Best model out of 75000 models (3 runs)
Velocity model Model parameters
---------------------------------------- ------------------------------
d vp vs rho d vs nu
[km] [km/s] [km/s] [g/cm3] [km] [km/s] [-]
---------------------------------------- ------------------------------
0.0015 0.4773 0.2389 2.0000 0.0015 0.2389 0.3329
0.0015 0.6175 0.3148 2.0000 0.0015 0.3148 0.3243
0.0015 0.7540 0.4099 2.0000 0.0015 0.4099 0.2903
0.0015 0.9715 0.5724 2.0000 0.0015 0.5724 0.2341
0.0015 1.2553 0.7678 2.0000 0.0015 0.7678 0.2011
0.0015 1.4520 0.7960 2.0000 0.0015 0.7960 0.2851
0.0015 1.6085 0.8779 2.0000 0.0015 0.8779 0.2879
0.0015 1.8379 1.0229 2.0000 0.0015 1.0229 0.2756
0.0015 2.2715 1.0593 2.0000 0.0015 1.0593 0.3611
0.0015 2.0950 1.1280 2.0000 0.0015 1.1280 0.2959
0.0015 2.5391 1.1726 2.0000 0.0015 1.1726 0.3644
0.0015 2.4278 1.3305 2.0000 0.0015 1.3305 0.2854
0.0015 2.6118 1.4435 2.0000 0.0015 1.4435 0.2801
0.0015 2.6806 1.5059 2.0000 0.0015 1.5059 0.2694
0.0015 3.2776 1.5632 2.0000 0.0015 1.5632 0.3528
0.0015 3.0416 1.6856 2.0000 0.0015 1.6856 0.2784
0.0015 3.3495 1.8566 2.0000 0.0015 1.8566 0.2783
0.0015 3.7398 2.1915 2.0000 0.0015 2.1915 0.2385
0.0015 3.9894 2.3113 2.0000 0.0015 2.3113 0.2474
1.0000 4.4713 2.4880 2.0000 - 2.4880 0.2758
---------------------------------------- ------------------------------
Number of layers: 20
Number of parameters: 59
Best model misfit: 0.0002
--------------------------------------------------------------------------------
from evodcinv.
Thanks. I guess I have to reinstall the codes if you've updated them? increasing_velocity=True ensures that the inversion produces a best-fitting positive velocity gradient? So to ensure that I am not omitting structure with low velocity layers, I have to run a separate inversion with increasing_velocity=False and compare the misfits as a strategy?
Yes, I read your paper and that's what got me interested in your codes.
from evodcinv.
Yes, please run the command pip install evodcinv -U
to update the package.
I guess that can be a strategy as well. I would also recommend not to invert for that many layers due to the non-unicity of the inversion problem. In general, simpler models are preferable (e.g., 10 would be enough).
from evodcinv.
You can look at the option "extra_terms" in "configure", this option allows to add additional misfit terms (e.g., to penalize models).
Here an example where we assume we have a linear prior model (Vs(z=0m)=0.1 km/s, Vs(z=40m)=3 km/s). with a regularization factor of 0.001 (can be tuned):
model = EarthModel()
for _ in range(10):
model.add(Layer(0.003, [0.1, 3.0]))
def prior(x):
vel = model.transform(x)
z = np.insert(vel[:-1, 0].cumsum(), 0, 0.0)
vprior = np.interp(z, [0.0, 0.04], [0.1, 3.0])
return 1.0e-3 * np.square(vel[:, 2] - vprior).sum()
model.configure(
optimizer="cpso",
misfit="norm2",
density=lambda vp: 2.0,
optimizer_args={
"popsize": 40,
"maxiter": 500,
"workers": -1,
"seed": 0,
},
extra_terms=[prior],
)
curves = [Curve(period, group, 0, "rayleigh", "group")]
res = model.invert(curves, maxrun=3)
res.plot_curve(period, 0, "rayleigh", "group", show="best")
plt.semilogx(period, group, ls="--")
Note that I am here using a "norm2"
misfit function.
--------------------------------------------------------------------------------
Best model out of 60000 models (3 runs)
Velocity model Model parameters
---------------------------------------- ------------------------------
d vp vs rho d vs nu
[km] [km/s] [km/s] [g/cm3] [km] [km/s] [-]
---------------------------------------- ------------------------------
0.0030 0.4892 0.2925 2.0000 0.0030 0.2925 0.2217
0.0030 0.9116 0.5113 2.0000 0.0030 0.5113 0.2705
0.0030 1.2563 0.7523 2.0000 0.0030 0.7523 0.2205
0.0030 1.7016 0.9394 2.0000 0.0030 0.9394 0.2808
0.0030 2.1432 1.0716 2.0000 0.0030 1.0716 0.3333
0.0030 2.3475 1.2724 2.0000 0.0030 1.2724 0.2920
0.0030 2.9727 1.5176 2.0000 0.0030 1.5176 0.3238
0.0030 3.1543 1.6221 2.0000 0.0030 1.6221 0.3203
0.0030 3.5203 1.9024 2.0000 0.0030 1.9024 0.2938
1.0000 4.4075 2.4936 2.0000 - 2.4936 0.2646
---------------------------------------- ------------------------------
Number of layers: 10
Number of parameters: 29
Best model misfit: 0.0004
--------------------------------------------------------------------------------
from evodcinv.
I am not sure if I am following what's happening within def prior(x)
. If you can add some comments, that'd be great. Ultimately, I'm guessing you are creating a misfit term that is norm2 + extra_terms
, correct? In this case, rather than fixing the velocity profile to a prior, I think it's useful to add a penalty term such that the velocity jumps across layers are restricted. For example see the second term in eq(2) here: https://www.sciencedirect.com/science/article/pii/S0040195117302482. The philosophy behind this is that drastic velocity changes don't occur across layers particularly if you have a model parameterisation with fine layers.
from evodcinv.
extra_terms
is a list of functions that take a single input (the model parameter vector x
) and returning a misfit value added to the main one (the one that measures the misfit with data points).
evodcinv
gives you full control of the misfit function which can be written for a model parameter
with extra_terms
.
I just updated evodcinv
(please update to version 2.2.0: pip install evodcinv -U
). This version adds a small factory of extra misfit functions that can be used, including your roughness criterion which can be used as follows:
from evodcinv import factory
model = EarthModel()
for _ in range(20):
model.add(Layer(0.0015, [0.1, 3.0]))
model.configure(
optimizer="de",
misfit="norm2",
density=lambda vp: 2.0,
optimizer_args={
"popsize": 50,
"maxiter": 500,
"workers": -1,
"seed": 0,
},
increasing_velocity=True,
extra_terms=[
lambda x: factory.smooth(x, alpha=1.0e-3),
lambda x: factory.prior(x, [0.0, 0.04], [0.1, 3.0], alpha=1.0e-3)
],
)
curves = [Curve(period, group, 0, "rayleigh", "group")]
res = model.invert(curves, maxrun=3)
--------------------------------------------------------------------------------
Best model out of 75000 models (3 runs)
Velocity model Model parameters
---------------------------------------- ------------------------------
d vp vs rho d vs nu
[km] [km/s] [km/s] [g/cm3] [km] [km/s] [-]
---------------------------------------- ------------------------------
0.0015 0.6590 0.2996 2.0000 0.0015 0.2996 0.3697
0.0015 0.6590 0.3043 2.0000 0.0015 0.3043 0.3646
0.0015 0.7904 0.4359 2.0000 0.0015 0.4359 0.2815
0.0015 1.0767 0.5475 2.0000 0.0015 0.5475 0.3256
0.0015 1.2365 0.6761 2.0000 0.0015 0.6761 0.2868
0.0015 1.5246 0.7643 2.0000 0.0015 0.7643 0.3322
0.0015 1.5583 0.8654 2.0000 0.0015 0.8654 0.2770
0.0015 1.6613 0.9901 2.0000 0.0015 0.9901 0.2245
0.0015 2.2096 1.1111 2.0000 0.0015 1.1111 0.3308
0.0015 2.0777 1.2241 2.0000 0.0015 1.2241 0.2342
0.0015 2.1615 1.3197 2.0000 0.0015 1.3197 0.2029
0.0015 2.3499 1.4282 2.0000 0.0015 1.4282 0.2071
0.0015 2.8160 1.5777 2.0000 0.0015 1.5777 0.2712
0.0015 2.8547 1.7478 2.0000 0.0015 1.7478 0.2002
0.0015 3.4989 1.8622 2.0000 0.0015 1.8622 0.3024
0.0015 3.5972 1.9435 2.0000 0.0015 1.9435 0.2939
0.0015 3.3907 2.0421 2.0000 0.0015 2.0421 0.2154
0.0015 3.6506 2.1789 2.0000 0.0015 2.1789 0.2233
0.0015 3.8436 2.3452 2.0000 0.0015 2.3452 0.2034
1.0000 4.4124 2.4914 2.0000 - 2.4914 0.2660
---------------------------------------- ------------------------------
Number of layers: 20
Number of parameters: 59
Best model misfit: 0.0014
--------------------------------------------------------------------------------
from evodcinv.
That's brilliant! Thank you! We should write a paper together to demonstrate the Utility of evodcinv in different applications, not just geophysics!
from evodcinv.
I think there should be an option to run "maxrun=" with different seeds. Does it make sense to run multiple inversions with the same seed? Shouldn't the starting point be different for different runs? Thanks!
from evodcinv.
Thanks, can you please explain the functionality of "seed":0 in model.configure(). Is it for the first run?
from evodcinv.
The purpose of this option is reproducibility, so that every time you run the same script, you get the same output. It's calling np.random.seed
in the backend before the inversions.
from evodcinv.
Hi, there are instances when an inversion run terminates before reaching 100% (i.e. population size x iterations) if the misfit approaches a small value, e.g. 10^-9, early on. In those instances, not all the model arrays in res (i.e. res.models[i]) are populated. Instead, beyond the model that approached the minimum misfit, all model arrays in res will be populated with zeros. It would be good to trim the res.models and everything else in res by deleting unused array locations that are filled with zeros, for example with np.where method (actually the nonzero value is len(res.xs)), so that res.threshold can be used with statistical measures without having to include intermediate code by hand. Thanks!
from evodcinv.
I think the default minimum misfit value in stochopy
is 1.0e-8, but I don't know how it's possible to get such a low misfit when inverting dispersion curves, unless you are inverting only few data points. Anyway, I will see what I can do, in the meantime, I would recommend to avoid such situation by setting ftol
to 0.0 in optimizer_args
(didn't try, but it should work).
from evodcinv.
Here's an example dispersion curve I've tried producing 10^-9 misfit. As you can see, this spans lower than usual periods. I just realized that the inversion is fitting only the first couple of points of the dispersion curve and hence the very low misfit. Why would it do that?
length of period array: 100
array([0.00716448, 0.00727753, 0.00739235, 0.00750899, 0.00762746,
0.00774781, 0.00787006, 0.00799423, 0.00812036, 0.00824849,
0.00837863, 0.00851083, 0.00864511, 0.00878152, 0.00892007,
0.00906081, 0.00920377, 0.00934899, 0.0094965 , 0.00964633,
0.00979853, 0.00995314, 0.01011018, 0.0102697 , 0.01043173,
0.01059632, 0.01076351, 0.01093334, 0.01110585, 0.01128107,
0.01145907, 0.01163987, 0.01182352, 0.01201007, 0.01219957,
0.01239205, 0.01258758, 0.01278618, 0.01298792, 0.01319285,
0.013401 , 0.01361245, 0.01382722, 0.01404539, 0.014267 ,
0.0144921 , 0.01472076, 0.01495302, 0.01518895, 0.0154286 ,
0.01567204, 0.01591931, 0.01617049, 0.01642562, 0.01668479,
0.01694804, 0.01721545, 0.01748707, 0.01776298, 0.01804325,
0.01832794, 0.01861711, 0.01891086, 0.01920923, 0.01951231,
0.01982018, 0.0201329 , 0.02045056, 0.02077323, 0.02110099,
0.02143392, 0.02177211, 0.02211563, 0.02246457, 0.02281902,
0.02317905, 0.02354477, 0.02391626, 0.02429361, 0.02467692,
0.02506627, 0.02546177, 0.02586351, 0.02627158, 0.02668609,
0.02710715, 0.02753484, 0.02796929, 0.02841059, 0.02885885,
0.02931419, 0.02977671, 0.03024652, 0.03072376, 0.03120852,
0.03170092, 0.0322011 , 0.03270917, 0.03322526, 0.03374949])
length of group velocity array: 100
array([0.04187945, 0.0522467 , 0.06277752, 0.0734745 , 0.08434025,
0.09537744, 0.11581332, 0.13734638, 0.15272252, 0.15786487,
0.16308835, 0.16839425, 0.17378387, 0.17925852, 0.18481956,
0.19046833, 0.19620623, 0.19589644, 0.19506623, 0.19422292,
0.1933663 , 0.19249617, 0.1916123 , 0.19071449, 0.18980252,
0.18887616, 0.18793518, 0.18697935, 0.18600845, 0.18502222,
0.18402043, 0.18300284, 0.18196919, 0.18091923, 0.17856765,
0.17617895, 0.17375257, 0.17128791, 0.16878436, 0.16855932,
0.17066695, 0.17280784, 0.17498251, 0.17719149, 0.17943532,
0.18171456, 0.18402976, 0.18638148, 0.18877032, 0.19119684,
0.19366165, 0.19616535, 0.20207335, 0.20888199, 0.21579806,
0.22282325, 0.22995929, 0.23720792, 0.24110632, 0.24461078,
0.24817054, 0.25178646, 0.25545943, 0.25919035, 0.26298014,
0.26682973, 0.27074005, 0.27471208, 0.27874677, 0.28262544,
0.28103002, 0.27940942, 0.27776326, 0.27609112, 0.2743926 ,
0.27266729, 0.27091474, 0.26913455, 0.26732627, 0.2660577 ,
0.2660577 , 0.2660577 , 0.2660577 , 0.2660577 , 0.2660577 ,
0.2660577 , 0.2660577 , 0.26676526, 0.26941959, 0.27211581,
0.27485457, 0.27763654, 0.2804624 , 0.28333285, 0.28624859,
0.28921034, 0.29221882, 0.29527476, 0.29837892, 0.30153206])
from evodcinv.
Thanks, that appeared to have fixed the issue. Also, I realized that dispersion curve predictions are not stored during the inversion and are recalculated within the plotting routines. This is a bit inefficient as evodcinv is then computing the same dispersion curves twice. Why not store dispersion curves within the inversion itself and use them in the plotting routines?
from evodcinv.
The first reason is API limitation: most optimization libraries (e.g.,
scipy
), if not all, simply only allow to calculate and store the misfit value for a model. Although I am the developer ofstochopy
, I would need to make some substantial changes in the API to allow to store more than that. The second reason is thatevodcinv
only calculates the dispersion velocities at discrete input periods. When plotting results, we usually want to plot the "full" dispersion curves (i.e., at periods different that of the data points), so storing the dispersion curves calculated during the inversion doesn't look as useful given that constraint. Anyway, after some benchmarks, the performance issues of the plotting routines are not due to the recalculation of the dispersion curves (it should only take a few seconds, especially after thresholding), there seem to be some issues with sequential plotting of thousands of lines withmatplotlib
.
Thanks for that explanation.
from evodcinv.
Are there reasons why a run would quit half way through other than approaching a machine precision minimum as discussed before?
from evodcinv.
It depends on the algorithm I guess. For instance, for Differential Evolution, it may be caused by a lack of diversity in the population at a given iteration, i.e., all the models are too similar so recombining models would be useless.
from evodcinv.
@keurfonluu is it possible to assemble the minimum misfit models across from multiple runs instead of assembling the lowest misfit models from the run that has the minimum misfit? The former should give you more precise models than the latter especially if the minimum misfit is not that different across multiple runs.
from evodcinv.
Thanks, exactly what I was looking for.
from evodcinv.
Related Issues (11)
- 2): Symbol not found: _PyBytes_FromString HOT 11
- Controlling size of inversion resuls HOT 3
- Multi-mode inversion yields models that do not predict the data HOT 4
- Adding a constraint for velocity increasing with depth HOT 4
- Support for Scholte wave inversion HOT 4
- Poisson's ratio HOT 5
- FileNotFoundError
- Example of plot results HOT 2
- Example not working HOT 1
- model.invert: concatenation of run results HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from evodcinv.