Commit 56be3a00 authored by Alexandru-Mihai GHERGHESCU

Move to HuggingFace tokenizers

Drop the SentencePiece tokenizers in favor of HuggingFace's tokenizers
library, which has a much nicer interface to work with, is written in
Rust, is parallelizable, and integrates better with the wider ecosystem.
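
For illustration only, here is a minimal sketch of the interface the
message refers to. It is not code from this commit: the toy corpus, the
vocabulary size and the "[UNK]" token are assumptions made for the
example, and it assumes the `tokenizers` package is installed.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Hypothetical toy corpus; the real project trains on its own data.
corpus = [
    "hello world",
    "hello tokenizers",
    "the world of tokenizers",
]

# Build an untrained BPE tokenizer that splits on whitespace first.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# vocab_size and special_tokens are illustrative values only.
trainer = BpeTrainer(
    vocab_size=100,
    special_tokens=["[UNK]"],
    show_progress=False,
)
tokenizer.train_from_iterator(corpus, trainer=trainer)

# Encode a single string; the result exposes both ids and tokens.
encoding = tokenizer.encode("hello world")
print(encoding.tokens, encoding.ids)

# Encode several strings in one call; batching happens in the Rust backend.
batch = tokenizer.encode_batch(["hello world", "the world"])
print([e.ids for e in batch])
```

`encode_batch` is the entry point that benefits from the library's
parallelism, since the whole batch is handed to the Rust backend in a
single call.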

Switching to HuggingFace tokenizers is not expected to affect performance.
parent ed936b00
1 merge request: !25 Re-factor optimus-prime code (optimus-prime v2)
Showing changes with 1396407 additions and 10 deletions