I have checked that, in pytorch, the std is computed as sqrt(var + eps). But the Synch

A question about the highlight "use sqrt(max(var, eps)) instead of sqrt(var + eps)" (synchronized-batchnorm-pytorch, closed)

vacancy commented on August 18, 2024

a question about the highlight "use sqrt(max(var, eps)) instead of sqrt(var + eps)"

from synchronized-batchnorm-pytorch.

Comments (4)

vacancy commented on August 18, 2024

Yes, in the latest version of PyTorch, this is computed by sqrt(var + eps).

Back when I was developing this package, I tried different combinations of possible implementations to match the behavior (not only the forward pass, but also the backward pass), and using clamp was the final decision.

It is worth noting that, because the gradient is computed differently (in PyTorch, BatchNorm is a fused operator) and because of numerical precision issues, neither of these two implementations will perfectly match the PyTorch BatchNorm, especially in the backward pass.

It might be a better idea to change it to sqrt(var + eps), since that at least matches the forward pass better. But given these historical issues and the need for backward compatibility, I don't think I should change the default behavior of this module.
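To make the difference between the two formulas concrete, here is a minimal pure-Python sketch. It is not the code of this package or of PyTorch; the function names are hypothetical and eps = 1e-5 is an assumed value. It compares both the denominator and its derivative with respect to var, which is one place where the backward passes can diverge:

```python
import math

EPS = 1e-5  # assumed value for illustration


def std_plus(var, eps=EPS):
    """Denominator in 'plus' style: sqrt(var + eps)."""
    return math.sqrt(var + eps)


def std_clamp(var, eps=EPS):
    """Denominator in 'clamp' style: sqrt(max(var, eps))."""
    return math.sqrt(max(var, eps))


def dstd_plus(var, eps=EPS):
    """d/dvar of sqrt(var + eps)."""
    return 0.5 / math.sqrt(var + eps)


def dstd_clamp(var, eps=EPS):
    """d/dvar of sqrt(max(var, eps)).

    Below eps the clamp makes the value constant in var, so the
    derivative is zero there (taking 0 at the kink var == eps).
    """
    return 0.5 / math.sqrt(var) if var > eps else 0.0


# Compare the two styles for a zero, a tiny, and an ordinary variance.
for var in (0.0, 1e-6, 1.0):
    print(var, std_plus(var), std_clamp(var), dstd_plus(var), dstd_clamp(var))
```

For ordinary variances the two denominators differ only slightly, but for variances below eps the clamp version stops passing gradient through var, while the plus version still does.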

I just added a new option for this module so that users can change this default behavior:

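import sync_batchnorm

# Choose one of the two modes:
sync_batchnorm.set_sbn_eps_mode('clamp')  # sqrt(max(var, eps)), the existing default
sync_batchnorm.set_sbn_eps_mode('plus')   # sqrt(var + eps), as in the latest PyTorch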

louyj136 commented on August 18, 2024

Thanks very much for helping! I have only checked the forward pass. But as far as I know, once the forward pass is built, the backward pass is determined. I don't quite understand why there are still some differences in the backward pass.

vacancy commented on August 18, 2024

The built-in BatchNorm fuses all backward operations into a single function. Theoretically, its output should be the same as what you get by following the computation graph. In practice, however, the two differ because of numerical precision issues.
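As a rough illustration of how two mathematically identical computations can disagree in floating point, here is a small pure-Python sketch. It is not the fused PyTorch kernel; the function names are hypothetical and the inputs are deliberately exaggerated so the effect is visible even in double precision. It computes the same (biased) variance with two algebraically equivalent formulas:

```python
def variance_two_pass(xs):
    """Mean of squared deviations: E[(x - mean)^2]."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def variance_one_pass(xs):
    """Algebraically equivalent form: E[x^2] - (E[x])^2."""
    mean = sum(xs) / len(xs)
    mean_sq = sum(x * x for x in xs) / len(xs)
    return mean_sq - mean * mean


# Large offset, small spread: the squares are too large to be stored
# exactly, so the one-pass form loses the small differences.
xs = [1e8 + 1, 1e8 + 2, 1e8 + 3]

print(variance_two_pass(xs))
print(variance_one_pass(xs))
```

On paper both functions return 2/3 for this input, yet the printed values disagree. A fused backward function and a step-by-step walk through the computation graph can differ in the same way, usually by much smaller amounts, because they round intermediate results at different points.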

louyj136 commented on August 18, 2024

Thanks for guiding me!
