  1. Jan 24, 2024
    • Fix final training loss calculation, fix estimation interval · 64302265
      Alexandru-Mihai GHERGHESCU authored
      Visual change: correctly display the final training loss.
      
      The final training loss didn't account for gradient accumulation, and
      was therefore much smaller than it should've been in reality.
      
      Fix the estimation interval, which was also not properly calculated due
      to gradient accumulation.
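      A minimal sketch (hypothetical trainer loop and names, not the repository's actual code) of why the reported loss is understated under gradient accumulation, and how reporting the unscaled loss fixes it:

      ```python
      import torch

      def train_epoch(model, dataloader, optimizer, loss_fn, grad_acc_steps=4):
          """Toy loop; assumes the loss is divided by grad_acc_steps before backward()."""
          model.train()
          final_loss = None
          optimizer.zero_grad()
          for i, (x, y) in enumerate(dataloader):
              logits = model(x)
              loss = loss_fn(logits.view(-1, logits.size(-1)), y.view(-1))
              # The scaled value below is only meant for backward(); reporting it
              # directly understates the true loss by a factor of grad_acc_steps.
              (loss / grad_acc_steps).backward()
              if (i + 1) % grad_acc_steps == 0:
                  optimizer.step()
                  optimizer.zero_grad()
                  # Report the unscaled loss; the estimation interval should likewise
                  # be counted in optimizer steps, not raw batches.
                  final_loss = loss.item()
          return final_loss
      ```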
    • Fix bad calculation for number of batches · 7aa99b4a
      Alexandru-Mihai GHERGHESCU authored
      There was a corner case in which the shape of the dataset's predictions y
      would not be correct, because the number of batches was miscalculated.
      
      This happened when `batch_len` was exactly divisible by `seq_len`, since
      the predictions, which are simply the text shifted once to the right,
      would not have that extra column at the end.
      
      Fix the above issue by decrementing the number of available batches by 1
      when `batch_len` is exactly divisible by `seq_len`.
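      A small illustration (hypothetical function name) of the off-by-one: the targets are the inputs shifted right by one token, so the last sequence needs one token beyond `batch_len`:

      ```python
      def num_batches(batch_len: int, seq_len: int) -> int:
          n = batch_len // seq_len
          # When batch_len divides evenly by seq_len, there is no spare token left
          # for the shifted targets of the last sequence, so drop that batch.
          if batch_len % seq_len == 0:
              n -= 1
          return n

      # batch_len = 12, seq_len = 4 -> tokens [0..11]; the last inputs would be
      # tokens [8..11], but their targets [9..12] need token 12, which doesn't exist.
      assert num_batches(12, 4) == 2
      assert num_batches(13, 4) == 3
      ```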
    • Ignore last batches when calculating final train loss · 4ab91bcf
      Alexandru-Mihai GHERGHESCU authored
      Visual change. This only changes what the trainer reports as the final
      training loss.
      
      Not quite sure the previous value was accurate anyway, since gradient
      accumulation does not let the optimizer step on every batch.
      
      For a big enough dataset, this should not have any impact at all.
      
      The final loss value is now reported based on the last loss calculation,
      correctly taking gradient accumulation into account.
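      A rough sketch of this reporting behaviour (all names and values made up): only losses recorded at optimizer steps count, so trailing batches that never complete an accumulation window are ignored:

      ```python
      grad_acc_steps = 4
      per_batch_losses = [2.1, 2.0, 1.9, 1.8, 1.7, 1.6]  # made-up values; 6 batches

      running, final_loss = 0.0, None
      for i, batch_loss in enumerate(per_batch_losses):
          running += batch_loss
          if (i + 1) % grad_acc_steps == 0:
              # Loss averaged over a full accumulation window, i.e. one optimizer step.
              final_loss = running / grad_acc_steps
              running = 0.0
      # The last two batches never trigger a step and are dropped from the report;
      # for a big enough dataset this has essentially no impact.
      print(final_loss)  # 1.95, the loss at the last optimizer step
      ```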
  2. Jan 22, 2024
  3. Jan 18, 2024
  4. Jan 12, 2024
  5. Jan 11, 2024
  6. Jan 09, 2024
  7. Jan 06, 2024
  8. Jan 05, 2024
    • Fix some issues with the wikitext103 dataset · c9dd8feb
      Alexandru-Mihai GHERGHESCU authored
      Couple of things:
      - rewrite code to better check when the dataset is downloaded
      - better cleanup after download + unzip
      - more aggressive exit on checksum mismatch
      - rewrite __main__
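      A hedged sketch of the kind of checksum handling described above (the archive name and expected hash are placeholders, not the dataset's real values):

      ```python
      import hashlib
      import sys
      from pathlib import Path

      EXPECTED_MD5 = "0123456789abcdef0123456789abcdef"  # placeholder, not the real checksum

      def md5sum(path: Path) -> str:
          h = hashlib.md5()
          with open(path, "rb") as f:
              for chunk in iter(lambda: f.read(1 << 20), b""):
                  h.update(chunk)
          return h.hexdigest()

      archive = Path("wikitext-103-v1.zip")  # assumed archive name
      if archive.exists() and md5sum(archive) != EXPECTED_MD5:
          archive.unlink()  # clean up the bad download
          sys.exit("wikitext103: checksum mismatch, aborting")  # exit hard instead of continuing
      ```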
    • Fix a few issues with the TinyStories dataset file · 400d138a
      Alexandru-Mihai GHERGHESCU authored
      Couple of things, mostly for code consistency and clarity:
      - reorganize imports
      - reorganize initial global variables (URL, MD5 etc.)
      - rename class to contain "Dataset"
      - fix comments
      
      There are also a few things I added / replaced / removed, after
      reconsidering how datasets should work:
      - add an additional "tinystories" folder into which the .txt files are downloaded
      - remove the pandas DataFrame
      - rewrite __main__ example
      - be more aggressive when checksums for downloaded files don't match
  9. Jan 03, 2024
  10. Jan 02, 2024
  11. Dec 28, 2023
    • Add progress bar display for training · faecfbce
      Alexandru-Mihai GHERGHESCU authored
      Use fastai's fastprogress package to display a progress bar while
      training, with useful information such as loss, estimated time of
      training, current learning rate, estimated ms/batch.
      
      Print end of epoch stats when finishing an epoch.
      
      Add a relevant parameter for the trainer to enable/disable the progress
      bar display.
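      A minimal sketch of such a loop with fastprogress (the trainer's real parameter name and step function may differ; `model_step` is hypothetical):

      ```python
      from fastprogress.fastprogress import master_bar, progress_bar

      def fit(model_step, dataloader, epochs=3, show_progress=True):
          mb = master_bar(range(epochs)) if show_progress else range(epochs)
          for epoch in mb:
              batches = progress_bar(dataloader, parent=mb) if show_progress else dataloader
              for batch in batches:
                  loss, lr, ms_per_batch = model_step(batch)  # hypothetical training step
                  if show_progress:
                      # Inline stats next to the inner bar: loss, learning rate, ms/batch.
                      mb.child.comment = f"loss {loss:.3f} lr {lr:.2e} {ms_per_batch:.0f} ms/batch"
              if show_progress:
                  mb.write(f"epoch {epoch + 1}: loss {loss:.3f}")  # end-of-epoch stats
      ```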
  12. Dec 27, 2023
  13. Nov 24, 2023