Gradient accumulation

Status: Merged. Alexandru-Mihai GHERGHESCU requested to merge feature/grad_acc into main.

Implements gradient accumulation. This should help when training with more data, since the effective batch size is no longer limited by GPU memory: gradients from several smaller batches are summed before each optimizer step. Training time should stay the same or become slightly slower.
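
To illustrate the idea, here is a minimal, framework-free sketch. It is not the code from this merge request. The functions `grad_mse` and `accumulated_grad`, the toy linear model, and the data are all hypothetical. It assumes equally sized micro-batches and a mean-reduced loss, in which case scaling each micro-batch gradient by the number of accumulation steps reproduces the gradient of one large batch.

```python
# Hypothetical illustration of gradient accumulation on a toy model
# y = w*x + b with mean squared error. Not the merge request's code.

def grad_mse(w, b, xs, ys):
    """Gradient of the mean squared error over the given samples."""
    n = len(xs)
    gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return gw, gb


def accumulated_grad(w, b, xs, ys, micro_batch_size):
    """Accumulate gradients over equally sized micro-batches.

    Each micro-batch gradient is divided by the number of accumulation
    steps, so the running sum matches the gradient of the full batch.
    Only one micro-batch is processed at a time.
    """
    assert len(xs) % micro_batch_size == 0, "sketch assumes equal-sized micro-batches"
    steps = len(xs) // micro_batch_size
    gw_acc, gb_acc = 0.0, 0.0
    for i in range(steps):
        lo, hi = i * micro_batch_size, (i + 1) * micro_batch_size
        gw, gb = grad_mse(w, b, xs[lo:hi], ys[lo:hi])
        gw_acc += gw / steps
        gb_acc += gb / steps
    return gw_acc, gb_acc


# Toy data (assumed for illustration): y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]

full = grad_mse(0.5, 0.0, xs, ys)                              # one batch of 8
acc = accumulated_grad(0.5, 0.0, xs, ys, micro_batch_size=2)   # 4 micro-batches of 2
```

Here `full` and `acc` agree up to floating-point rounding. The same number of samples is processed either way, only in smaller pieces, which is consistent with the expectation that training time stays roughly the same or gets slightly slower.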
