gautierdag / tokenizer-bench

Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation"
13 · Updated 9 months ago

Related projects

Alternatives and complementary repositories for tokenizer-bench