Comments (31)
I want to insert a U-Net into the `scalar_born` function as the "scatter" parameter, but it seems to fail. I have tested the U-Net on its own, and it is correct.
It shows that parameters that need autograd have been changed by an in-place operation.
This is the code I use together with Deepwave:
Do you have any idea about this? 😭
I have done what you suggested but still have a problem, shown here:
```
RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>
     47          )
     48          epoch_loss += loss.item()
---> 49          loss.backward()
     50          # loss.backward(retain_graph=True)
     51          optimiser.step()

1 frames
/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    195     # some Python versions print out the first line of a multi-line function
    196     # calls in the traceback and some print out the last line
--> 197     Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    198         tensors, grad_tensors, retain_graph, create_graph, inputs,
    199         allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
```
If I change back to `loss.backward(retain_graph=True)`, it generates another problem:
```
RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>
     48          epoch_loss += loss.item()
     49          # loss.backward()
---> 50          loss.backward(retain_graph=True)
     51          optimiser.step()
     52          # scatter.detach()

1 frames
/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    195     # some Python versions print out the first line of a multi-line function
    196     # calls in the traceback and some print out the last line
--> 197     Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    198         tensors, grad_tensors, retain_graph, create_graph, inputs,
    199         allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 64, 1, 1]] is at version 6; expected version 5 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True)
```
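For context, here is a minimal standalone sketch (not the poster's code; all names are illustrative) that reproduces this class of error: a network output computed once is reused in the losses of several batches, so later `backward()` calls traverse a graph through the network that has already been freed (or whose saved tensors have already been modified):

```python
import torch

# Two layers, so intermediate activations require grad
net = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.Linear(8, 4))
optimiser = torch.optim.SGD(net.parameters(), lr=0.1)

x = torch.randn(1, 4)
scatter = net(x)  # computed once, then reused across batches

for batch in range(2):
    loss = ((scatter - torch.randn(1, 4)) ** 2).mean()
    loss.backward()  # 2nd iteration: "Trying to backward through the
                     # graph a second time" (graph freed by 1st pass)
    # With loss.backward(retain_graph=True) instead, optimiser.step()
    # below modifies the weights in place, so the 2nd pass raises the
    # "modified by an inplace operation" version-counter error.
    optimiser.step()
```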
What I want to do is use a neural network to produce the scatter model, just as you said. But this problem always comes up, so I cannot update the parameters inside the network.
Yes! It runs now. But a quick question: why do we need to update each batch rather than each epoch? In the examples in the Deepwave documentation, updates are applied each epoch.
If we also update the background velocity using a neural network, will this work? Do we need to set `scatter.requires_grad = False`, or can they be updated simultaneously?
That is good news.
If you only wish to run your network and update its parameters once each epoch, rather than each batch, then I think your code can be modified to achieve that. This might reduce runtime (by running your network less frequently) and make updates more stable (as they will be based on the gradients from an entire epoch rather than just one batch), but it will probably also make convergence take longer, since your model will be updated much less frequently, and it will lose the randomness benefit of small-batch updates. If you wish to do it, then something like this might work:
```python
for epoch in range(n_epochs):
    epoch_loss = 0

    # Run the network once per epoch to get the scattering model, then
    # detach it so the batch loop does not backpropagate into net
    scatter = net(v_mig1.unsqueeze(0).unsqueeze(0))
    scatter1 = scatter.detach().squeeze(0).squeeze(0)
    scatter1.requires_grad_()
    optimiser1 = torch.optim.SGD([scatter1], lr=1)
    optimiser1.zero_grad()

    for batch in range(n_batch):
        batch_start = batch * n_shots_per_batch
        batch_end = min(batch_start + n_shots_per_batch, n_shots)
        if batch_end <= batch_start:
            continue
        s = slice(batch_start, batch_end)
        out = scalar_born(v_mig1, scatter1, dx, dt,
                          source_amplitudes=source_amplitudes[s],
                          source_locations=source_locations[s],
                          receiver_locations=receiver_locations[s],
                          pml_freq=freq)
        loss = (1e9 * loss_fn(out[-1] * mask[s],
                              observed_scatter_masked[s]))
        epoch_loss += loss.item()
        loss.backward()  # accumulates gradients in scatter1 only

    optimiser1.step()  # update scatter1 using the whole epoch's gradient
    scatter1 = scatter1.detach().unsqueeze(0).unsqueeze(0)

    # Train net to produce scatter1
    for it in range(n_its):
        optimiser.zero_grad()
        scatter = net(v_mig1.unsqueeze(0).unsqueeze(0))
        loss = loss_fn(scatter, scatter1)
        loss.backward()
        optimiser.step()
```
There may be other, perhaps more elegant, ways. This one separates the estimation of the scattering model each epoch from running `scalar_born`. Within the loop over batches the gradients will only flow back to `scatter1` - they will not flow back into your network - so the intermediate states of your network that were saved during the forward pass for use in the backward pass are not needed inside this loop (and we avoid the problem, which caused your issue, of them being freed when `loss.backward()` is called). After the loop over batches we then compare `scatter` (produced by your network) with `scatter1` (updated after an epoch of Deepwave) and update your network's parameters to try to match it.
In most cases, however, the cost of running your neural network will be insignificant compared to the cost of running Deepwave, so calling your neural network each batch (the way you currently do in your working code) will not substantially affect runtime, and it avoids the complications in my code above of having multiple optimisers, etc. You can still update its parameters only every epoch, rather than every batch, if you wish, by moving `optimiser.zero_grad()` and `optimiser.step()` outside the loop over batches. This will cause gradients to accumulate over batches, and the accumulated update will only be applied to your network's parameters once per epoch, as in the sketch below.
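A rough sketch of that simpler variant, reusing the variable names from the code above (so `net`, `optimiser`, and the `scalar_born` arguments are assumed to be defined as before):

```python
for epoch in range(n_epochs):
    epoch_loss = 0
    optimiser.zero_grad()  # moved outside the batch loop
    for batch in range(n_batch):
        batch_start = batch * n_shots_per_batch
        batch_end = min(batch_start + n_shots_per_batch, n_shots)
        if batch_end <= batch_start:
            continue
        s = slice(batch_start, batch_end)
        # Run the network every batch so each backward pass has a
        # fresh graph through the network
        scatter = net(v_mig1.unsqueeze(0).unsqueeze(0)).squeeze(0).squeeze(0)
        out = scalar_born(v_mig1, scatter, dx, dt,
                          source_amplitudes=source_amplitudes[s],
                          source_locations=source_locations[s],
                          receiver_locations=receiver_locations[s],
                          pml_freq=freq)
        loss = 1e9 * loss_fn(out[-1] * mask[s], observed_scatter_masked[s])
        epoch_loss += loss.item()
        loss.backward()  # gradients accumulate in net's parameters
    optimiser.step()  # single parameter update per epoch
```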
A learning rate of 1 (combined with the large scaling applied to `loss`) might be appropriate for SGD when the velocity/scattering models are being updated directly, but when the parameters being updated are those of a neural network, more traditional learning rates might be appropriate - you should check the amplitudes of the gradients of your network's parameters to see whether they are reasonable.
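One way to inspect those gradient amplitudes after a backward pass (a generic PyTorch snippet; `net` is the network assumed above):

```python
for name, param in net.named_parameters():
    if param.grad is not None:
        print(f"{name}: max |grad| = {param.grad.abs().max().item():.3e}")
```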
Regarding your second question, yes, you can also update the velocity (and the source amplitude too, if you wish) simultaneously. You will need to set `requires_grad=True` for it and add it to the list of parameters being updated by your optimiser. If you do invert for velocity, I suggest using one of the methods discussed in the Deepwave examples for limiting the potential range of velocities.
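As one illustration of such a constraint (a sketch in the spirit of the Deepwave examples, not code copied from them; `v_init`, `v_min`, and `v_max` are assumed, with `v_init` a plain tensor lying strictly inside the range):

```python
import torch

class ConstrainedVelocity(torch.nn.Module):
    """Stores an unconstrained parameter and maps it through a sigmoid,
    so the returned velocity always lies in [v_min, v_max]."""

    def __init__(self, v_init, v_min, v_max):
        super().__init__()
        self.v_min = v_min
        self.v_max = v_max
        self.param = torch.nn.Parameter(
            torch.logit((v_init - v_min) / (v_max - v_min))
        )

    def forward(self):
        return (self.v_min
                + (self.v_max - self.v_min) * torch.sigmoid(self.param))

# The constrained velocity and the scattering model can then share one
# optimiser and be updated simultaneously, e.g. (bounds are illustrative):
# velocity_model = ConstrainedVelocity(v_mig1, 1400.0, 5000.0)
# optimiser1 = torch.optim.SGD([*velocity_model.parameters(), scatter1], lr=1)
# v = velocity_model()  # pass v to scalar_born in place of v_mig1
```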
Thanks! That's quite useful; I will try it as you suggest!
Hi,
If I just update the background velocity model and use the `scalar_born` function, the loss is unchanged.
The background velocity heavily affects the RTM imaging performance, and it also affects the scattered wavefields, so it is quite confusing to me that the loss is unchanged when we input different background velocities.
Can you explain it a little bit?
Can you show me the code you used to conclude that the loss is unchanged when you update the background velocity model?
And do you mean that the loss didn't change when you used completely different velocity models, or only that it didn't change over iterations when you were inverting for the velocity model? If the latter, have you checked that the velocity model actually changed over iterations?
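A quick way to run that check (generic PyTorch; `v` stands for the velocity tensor being inverted):

```python
v_before = v.detach().clone()  # snapshot before the epoch
# ... run one epoch of the inversion loop here ...
change = (v.detach() - v_before).abs().max().item()
print(f"max |change| in velocity: {change:.3e}")
```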
I use `scalar_born` to update the net: the net generates a velocity model, which I feed into `scalar_born`. The loss for each epoch does not change, like this:
You mean the loss is unchanged even though the velocity has been updated?
Hi Weilin,
I am closing this issue, but please feel free to reopen it, or to create a new issue, if you have any more problems or questions.