    Move to HuggingFace tokenizers · 56be3a00
    Alexandru-Mihai GHERGHESCU authored
    Drop the SentencePiece tokenizer in favor of HuggingFace's tokenizers
    library, which has a much nicer interface, is written in Rust, is
    parallelizable, and integrates better with the rest of the ecosystem.
    
    The switch to HuggingFace tokenizers should not affect performance.