Unverified Commit 8579fc15 authored by Alexandru-Mihai GHERGHESCU

Adjust optimizer epsilon value for AMP

Pick a better default epsilon value. In mixed precision training, the
optimizer only ever works on the master fp32 copy of the model, so this
value never touches the fp16 gradients and technically didn't need to
change. In pure fp16 training, however, any epsilon value lower than
1e-7 would simply underflow to 0 and become useless.

Although the framework doesn't directly support the second case above,
an epsilon value of 1e-7 seems like a better default for both AMP and
normal training.
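
As a quick sanity check of the underflow claim (a minimal sketch, not part of
this commit, assuming only that PyTorch is installed; printed values are
approximate):

    import torch

    # The old default eps=1e-9 is below the smallest positive fp16 value (~6e-8)
    # and rounds to zero, while the new default eps=1e-7 stays representable.
    print(torch.tensor(1e-9, dtype=torch.float16))  # tensor(0., dtype=torch.float16)
    print(torch.tensor(1e-7, dtype=torch.float16))  # roughly 1.19e-07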
parent 6db26eb1
Merge request !17: Add fp16 mixed precision training
Pipeline #55059 passed
@@ -89,9 +89,14 @@ def main(batch_size: int = 8,
     _total_params = sum(p.numel() for p in model.parameters())
     print(f"Number of model parameters: {_total_params}")
-    # define loss metric and optimizer
+    # define loss metric
     criterion = nn.CrossEntropyLoss()
-    optimizer = torch.optim.Adam(model.parameters(), betas=(0.9, 0.999), eps=1e-9)
+    # define optimizer
+    # see [1] for a discussion on what the epsilon value should be for amp; 1e-7
+    # is a good default for both amp and normal training
+    # [1]: https://github.com/pytorch/pytorch/issues/26218
+    optimizer = torch.optim.Adam(model.parameters(), betas=(0.9, 0.999), eps=1e-7)
     print("Starting training...")
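
For context, a minimal sketch of how this optimizer default fits into PyTorch
fp16 mixed precision training; this is not the repository's actual training
loop, and the model, shapes, and device below are placeholders. The GradScaler
unscales gradients back to fp32 before Adam applies its update, which is why
epsilon never meets fp16 values under AMP:

    import torch
    import torch.nn as nn

    model = nn.Linear(128, 10).cuda()        # master weights stay in fp32
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), betas=(0.9, 0.999), eps=1e-7)
    scaler = torch.cuda.amp.GradScaler()

    inputs = torch.randn(8, 128, device="cuda")
    targets = torch.randint(0, 10, (8,), device="cuda")

    for _ in range(3):
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():      # forward pass runs in fp16 where safe
            loss = criterion(model(inputs), targets)
        scaler.scale(loss).backward()        # backward on the scaled loss
        scaler.step(optimizer)               # unscales to fp32 grads, then steps
        scaler.update()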