Repository for the technical report "On the Convergence of AdaBound and its Connection to SGD" [PDF], including the implementation of the proposed bias-corrected, dampened form of momentum SGD (CSGD for short).
CSGD is offered as a stand-alone PyTorch module in csgd.py.
PyTorch == 1.1.0
The code should also work with earlier versions of PyTorch (e.g. 0.4.0).
CSGD can be used like any of the PyTorch built-in optimizers, for example:
from csgd import CSGD
optimizer = CSGD(params, lr=0.1, momentum=0.9, weight_decay=5e-4)
Note that CSGD does not accept 'nesterov' or 'dampening' arguments: it uses standard heavy-ball momentum with the dampening fixed to the momentum value.
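To illustrate the update rule described above, here is a minimal pure-Python sketch (not the repository's csgd.py) of dampened heavy-ball momentum with a bias-corrected buffer, applied to a 1-D quadratic. The function name and structure are assumptions for illustration only; refer to csgd.py for the actual optimizer.

```python
# Illustrative sketch of bias-corrected dampened momentum SGD.
# Assumed update rule (not taken verbatim from csgd.py):
#   v_t  = momentum * v_{t-1} + (1 - momentum) * g_t   (dampening = momentum)
#   v^_t = v_t / (1 - momentum ** t)                   (bias correction)
#   x_t  = x_{t-1} - lr * v^_t

def csgd_step(x, buf, grad, lr, momentum, step):
    """One hypothetical CSGD update on a scalar parameter."""
    buf = momentum * buf + (1.0 - momentum) * grad  # dampened momentum buffer
    corrected = buf / (1.0 - momentum ** step)      # bias-corrected buffer
    return x - lr * corrected, buf

# Minimize f(x) = 0.5 * x**2, whose gradient is x.
x, buf = 5.0, 0.0
for t in range(1, 501):
    grad = x
    x, buf = csgd_step(x, buf, grad, lr=0.1, momentum=0.9, step=t)

print(x)  # close to the minimum at 0
```

Because of the bias correction, the very first step applies (almost) the full gradient rather than the heavily dampened buffer, mirroring Adam's treatment of its first-moment estimate.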
@article{savarese2019adaboundcsgd,
title={On the Convergence of AdaBound and its Connection to SGD},
author={Pedro Savarese},
journal={arXiv preprint arXiv:1908.04457},
year={2019}
}