gautierdag / tokenizer-bench

Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation"
15 · Updated last year

Alternatives and similar repositories for tokenizer-bench:

Users who are interested in tokenizer-bench are comparing it to the libraries listed below.