Ignore last batches when calculating final train loss (4ab91bcf) · Commits · NetSys / Optimus Prime

Unverified Commit 4ab91bcf authored 1 year ago by Alexandru-Mihai GHERGHESCU

Ignore last batches when calculating final train loss

Cosmetic change. This only affects the value the trainer reports as the
final training loss; training itself is unchanged.

The previously reported value was probably not accurate anyway, since
with gradient accumulation the optimizer does not step on every batch.

For a large enough dataset, this should have no noticeable impact.

The final loss value is now reported from the last calculation of the
loss, which correctly takes gradient accumulation into account.
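
The behaviour described above can be illustrated with a minimal sketch.
This is not the trainer's actual code: the function name
`final_train_loss` and the parameter `grad_acc_steps` are hypothetical,
and plain floats stand in for per-batch losses. It assumes the loss is
only recorded when a full accumulation window completes (where a real
trainer would step the optimizer), so trailing batches that do not fill
a window never reach the reported value.

```python
def final_train_loss(batch_losses, grad_acc_steps):
    """Return the loss of the last complete accumulation window.

    Hypothetical helper for illustration only. Returns None if no
    window ever completes.
    """
    if grad_acc_steps < 1:
        raise ValueError("grad_acc_steps must be >= 1")

    reported = None    # loss recorded at the most recent optimizer step
    window_sum = 0.0   # scaled losses accumulated in the current window

    for i, loss in enumerate(batch_losses, start=1):
        # Scale each batch loss so a full window sums to its mean.
        window_sum += loss / grad_acc_steps
        if i % grad_acc_steps == 0:
            # A real trainer would call optimizer.step() here.
            reported = window_sum
            window_sum = 0.0

    # Leftover batches that did not fill a window are ignored.
    return reported


losses = [4.0, 2.0, 3.0, 1.0, 9.0]
print(final_train_loss(losses, grad_acc_steps=2))  # 2.0
```

With five batches and an accumulation window of two, the fifth batch
never triggers an optimizer step, so the reported value comes from the
second window (batches three and four).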

parent a092db0a

Showing with 0 additions and 4 deletions
