Add fp16 mixed precision training
This should give training a theoretical 2x speedup (in practice the gain is usually smaller), with close to no loss in model quality. The interface lets the user enable or disable mixed precision; when disabled, training falls back to normal float32 precision. CPU training support has been dropped: training on CPU takes far longer than on GPUs, with or without mixed precision, so it is not a realistic option, and supporting both devices alongside mixed precision would complicate the code too much.
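A minimal sketch of how such a toggle is typically wired up, assuming a PyTorch training loop with CUDA available; the `use_fp16` flag, the `train` function, and the loss choice are illustrative and not necessarily the repository's actual interface:

```python
import torch
import torch.nn as nn

def train(model, optimizer, loader, use_fp16=True, device="cuda"):
    model.to(device).train()
    # GradScaler applies dynamic loss scaling so fp16 gradients do not
    # underflow; with enabled=False it is a no-op and training is plain fp32.
    scaler = torch.cuda.amp.GradScaler(enabled=use_fp16)

    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()

        # autocast runs eligible ops in fp16 while keeping numerically
        # sensitive ones in fp32.
        with torch.cuda.amp.autocast(enabled=use_fp16):
            outputs = model(inputs)
            loss = nn.functional.cross_entropy(outputs, targets)

        scaler.scale(loss).backward()  # backward pass on the scaled loss
        scaler.step(optimizer)         # unscales grads, then optimizer.step()
        scaler.update()                # adjusts the scale factor for next step
```

With `use_fp16=False` the scaler and autocast both become no-ops, which is one simple way to provide the float32 fallback described above.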