[EMNLP 2022/2023] Fast Vocabulary Transfer & Multi-word Tokenization
☆27Jan 19, 2025Updated last year
Alternatives and similar repositories for fast-vocabulary-transfer
Users that are interested in fast-vocabulary-transfer are comparing it to the libraries listed below.
- Neural Markov Logic Networks☆13Feb 14, 2022Updated 4 years ago
- PyC (Pytorch Concepts) is a PyTorch-based library for training concept-based interpretable deep learning models.☆34May 8, 2026Updated 2 weeks ago
- Codebase for VAEL: Bridging Variational Autoencoders and Probabilistic Logic Programming☆24Jun 30, 2023Updated 2 years ago
- Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation"☆22Feb 14, 2024Updated 2 years ago
- Evaluate Transformers from the Hub 🔥☆14Apr 3, 2026Updated last month
- ☆38Jan 26, 2024Updated 2 years ago
- Defeasible Natural Language Inference☆13Dec 4, 2020Updated 5 years ago
- Perf monitoring CLI tool for Apple Silicon☆10Jan 25, 2023Updated 3 years ago
- ☆18Mar 26, 2022Updated 4 years ago
- A set of utilities to turn Dataclasses into useful configuration managers.☆11Mar 27, 2024Updated 2 years ago
- ☆16Mar 4, 2024Updated 2 years ago
- Python code and data for the post "Word Segmentation, or Makingsenseofthis"☆17Oct 24, 2022Updated 3 years ago
- Code for Zero-Shot Tokenizer Transfer☆144Jan 14, 2025Updated last year
- PathPiece tokenizer☆14Nov 10, 2024Updated last year
- Code and data from the paper 'Human Feedback is not Gold Standard'☆20May 5, 2026Updated 2 weeks ago
- ☆14Apr 22, 2024Updated 2 years ago
- A short React project to refresh myself on the best practices for APIs when used with React☆11Jan 4, 2023Updated 3 years ago
- [ICLR 2024] "Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality" by Xuxi Chen*, Yu Yang*, Zhangyang Wang, Baha…☆15May 18, 2024Updated 2 years ago
- Repository for Teaching Broad Reasoning Skills for Multi-Step QA by Generating Hard Contexts, EMNLP22☆19Jun 23, 2023Updated 2 years ago
- Font style transfer for Devanāgarī script using GANs☆12Jun 25, 2022Updated 3 years ago
- Code for "Unlearning Traces the Influential Training Data of Language Models"☆13Jun 13, 2024Updated last year
- Code from the CMU LM inference fall 2025 edition.☆42Dec 7, 2025Updated 5 months ago
- PyTorch implementation of the original evidential-deep-learning (https://github.com/aamini/evidential-deep-learning/)☆13Sep 20, 2021Updated 4 years ago
- Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…☆14Aug 8, 2025Updated 9 months ago
- The elegant integration of huggingface/nlp and fastai2 and handy transforms using pure huggingface/nlp☆19Oct 6, 2020Updated 5 years ago
- ☆12Jun 30, 2024Updated last year
- ☆20Feb 2, 2026Updated 3 months ago
- https://footprints.baulab.info☆18Oct 4, 2024Updated last year
- ☆13Mar 5, 2024Updated 2 years ago
- TREC QA dataset for question answering cleaned for usage in Question Answering☆14Aug 26, 2019Updated 6 years ago
- ☆15Jan 29, 2025Updated last year
- PyPSDD porting to Python 3 + PyTorch equivalent tree construction.☆16Jun 7, 2023Updated 2 years ago
- View instruction register through GameGuardian!☆14Dec 8, 2019Updated 6 years ago
- ☆18Mar 23, 2025Updated last year
- Curated list of open source and openly accessible large language models☆26Jul 16, 2023Updated 2 years ago
- Firecracker VM orchestration for Claude Code sessions☆29Updated this week
- ☆15Dec 8, 2022Updated 3 years ago
- ☆11Feb 3, 2026Updated 3 months ago
- CLAIR: A (surprisingly) simple semantic text metric with large language models.☆22Jan 28, 2024Updated 2 years ago