Out-of-Memory from context length caused by the KV cache?

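Aside from the model weights, we also use 2 * [bytes per element] * [layers] * [heads] * [dimensions per head] * [context length (num_tokens)] bytes of memory for the KV cache. The factor of 2 accounts for storing both keys and values, and [bytes per element] is the size of one cached value (e.g. 2 for fp16). The cache therefore grows linearly with context length, so a long enough context can exhaust memory even when the model itself fits.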
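A rough sketch of the estimate above in Python. The function name, the parameter names, and the example configuration (32 layers, 32 heads, 128 dimensions per head, fp16) are hypothetical and chosen only for illustration; `batch_size` is an added assumption that is not part of the formula in this issue and defaults to 1.

```python
def kv_cache_bytes(num_layers, num_heads, head_dim, context_length,
                   bytes_per_element=2, batch_size=1):
    """Estimate KV cache size in bytes.

    The leading 2 counts both the key and the value tensors.
    bytes_per_element is the size of one cached value (2 for fp16).
    batch_size is an assumption beyond the original formula.
    """
    return (2 * bytes_per_element * num_layers * num_heads * head_dim
            * context_length * batch_size)


# Hypothetical model configuration, not taken from this issue.
size = kv_cache_bytes(num_layers=32, num_heads=32, head_dim=128,
                      context_length=4096)
print(f"{size / 2**30:.2f} GiB")  # prints "2.00 GiB"
```

Doubling the context length doubles the result, which is why a run that fits at one context length can go out of memory at a longer one. For models whose key/value heads are fewer than the query heads, `num_heads` here would presumably need to be the number of key/value heads.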

Activity

Alexandru-Mihai GHERGHESCU @agherghescu2411 · 1 year ago

Related article.
Alexandru-Mihai GHERGHESCU closed 1 year ago
