Commits on Source (2)
Alexandru-Mihai GHERGHESCU authored
Mixed precision should give training a theoretical 2x speedup in time (though in practice the gain is usually smaller), with close to no loss in model performance. The interface lets the user choose between mixed precision training and no mixed precision, in which case training falls back to normal float32 precision. CPU training support has been dropped: with or without mixed precision, training on a CPU takes far longer than on GPUs and is not an alternative anyone seriously considers, and with the addition of mixed precision, supporting both CPU and GPU would complicate things too much.
6db26eb1
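
As a rough illustration of the toggle described in this commit, here is a minimal PyTorch-style sketch of a GPU-only training loop with optional mixed precision; the function name, signature, and use_amp flag are illustrative assumptions, not the framework's actual API.

    import torch

    def train(model, loader, optimizer, loss_fn, use_amp=True):
        """Minimal GPU-only training loop; use_amp=False falls back to float32."""
        device = torch.device("cuda")  # CPU training is not supported
        model.to(device)

        # GradScaler becomes a no-op when enabled=False, so one loop covers both modes.
        scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()

            # autocast runs eligible ops in float16 and keeps
            # precision-sensitive ops in float32.
            with torch.cuda.amp.autocast(enabled=use_amp):
                loss = loss_fn(model(inputs), targets)

            # Scale the loss so small fp16 gradients don't underflow to zero;
            # the optimizer still updates the float32 master weights.
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()
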
Alexandru-Mihai GHERGHESCU authored
Pick a better default epsilon value. Since this value should never touch the fp16 gradients in mixed precision training (the optimizer only ever works on the master fp32 copy of the model), it didn't strictly need to be changed. However, in pure fp16 training, an epsilon much lower than 1e-7 (such as the common 1e-8 default) simply underflows to 0 and becomes useless. Although the framework doesn't directly support pure fp16 training, an epsilon value of 1e-7 seems like a better default for both AMP and normal training.
8579fc15
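
A quick check of the underflow behaviour behind this commit (shown here with PyTorch, though any fp16 implementation behaves the same): the common 1e-8 default rounds to zero when cast to float16, while 1e-7 survives as a subnormal.

    import torch

    # 1e-8 is below half of float16's smallest subnormal (~6e-8), so it
    # rounds to zero; 1e-7 is still representable as a subnormal.
    print(torch.tensor(1e-8, dtype=torch.float16))  # tensor(0., dtype=torch.float16)
    print(torch.tensor(1e-7, dtype=torch.float16))  # tensor(1.1921e-07, dtype=torch.float16)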