Nov 03, 2023
  Fix key-value caching for seqlen != 1 · e9077bd2
  flu0r1ne authored
      This commit fixes a bug in the key-value caching. Currently,
      a square (seq_len, seq_len) attention mask is applied to the
      scores matrix even though its shape does not match, which
      results in a runtime error. In a correct implementation, the
      decoder mask needs to describe how the new seq_len tokens
      interact with all the cached tokens. That is, the attention
      mask needs to be of shape (seq_len, total_len), indicating how
      the token at row i (representing token i + cached_len in the
      transformer model) attends to token j. Accordingly, the matrix
      needs to mask entries where j > cached_len + i. This patch
      horizontally appends (seq_len, cached_len) zeros to an
      upper-triangular mask of size (seq_len, seq_len) to form the
      (seq_len, total_len) mask.
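The mask construction the commit describes can be sketched as follows. This is a minimal NumPy illustration, not the patch itself; the function name `build_decoder_mask` is hypothetical, and an additive mask convention is assumed (0 for visible positions, -inf for masked ones, added to the scores before softmax).

```python
import numpy as np

def build_decoder_mask(seq_len: int, cached_len: int) -> np.ndarray:
    """Build a (seq_len, total_len) additive attention mask.

    Row i corresponds to token i + cached_len in the sequence.
    Entry (i, j) is -inf wherever j > cached_len + i (a query may
    not attend to a future token) and 0 elsewhere, so all cached
    tokens remain visible to every new token.
    """
    # Causal part: -inf strictly above the diagonal of a
    # (seq_len, seq_len) block.
    causal = np.triu(np.full((seq_len, seq_len), -np.inf), k=1)
    # Cached tokens are all in the past, so their columns are zeros.
    cached = np.zeros((seq_len, cached_len))
    # Horizontally join to get shape (seq_len, cached_len + seq_len).
    return np.hstack([cached, causal])
```

For example, with seq_len = 3 and cached_len = 2 the result is a (3, 5) matrix whose first two columns are all zeros, matching the shape of a scores matrix computed between 3 new queries and 5 total keys.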