gautierdag / tokenizer-bench

Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation"

Related projects: