bgub / tokka-bench
benchmarks for LLM tokenizers
☆14Updated last month
Alternatives and similar repositories for tokka-bench
Users who are interested in tokka-bench are comparing it to the libraries listed below:
- Check for data drift between two OpenAI multi-turn chat jsonl files.☆38Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆22Updated 2 years ago
- ☆43Updated 2 years ago
- NLP with Rust for Python 🦀🐍☆65Updated 5 months ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial)☆17Updated last year
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection☆102Updated last year
- A webhook that integrates the W&B model registry with Modal Labs☆15Updated last year
- Datamodels for hugging face tokenizers☆77Updated 3 weeks ago
- Pre-train Static Word Embeddings☆87Updated last month
- ☆49Updated 8 months ago
- ☆16Updated 7 months ago
- Feste is a free and open-source framework allowing scalable composition of NLP tasks using a graph execution model that is optimized and …☆42Updated 2 years ago
- ☆30Updated 3 years ago
- Framework for building and maintaining self-updating prompts for LLMs☆64Updated last year
- Pipeline components that support partial_fit.☆46Updated last year
- Chunk your text using gpt4o-mini more accurately☆44Updated last year
- Gzip and nearest neighbors for text classification☆57Updated 2 years ago
- Writing Blog Posts with Generative Feedback Loops!☆50Updated last year
- Official Implementation of the 'When XGBoost Outperforms GPT-4 on Text Classification: A Case Study' NAACL-W 2024 paper☆16Updated 10 months ago
- Multilingual Entity Linking model by BELA model☆12Updated 2 years ago
- obliquetree is an advanced decision tree implementation featuring oblique and axis-aligned splits, optimized performance.☆21Updated 2 months ago
- Official Repository for LEURN: Learning Explainable Univariate Rules with Neural Networks☆34Updated last year
- Notebooks for training universal 0-shot classifiers on many different tasks☆136Updated 9 months ago
- Seamless interface for using PyTorch distributed with Jupyter notebooks☆50Updated last month
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188)☆61Updated 2 years ago
- Tools to make language models a bit easier to use☆54Updated 3 weeks ago
- A pytest plugin to organize and track algorithm visualizations☆17Updated 10 months ago
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models…☆37Updated 3 years ago
- Library for fast text representation and classification.☆31Updated last year
- Use sync mode Playwright interactively, inside a Jupyter notebook☆15Updated 6 months ago