[EMNLP 2022/2023] Fast Vocabulary Transfer & Multi-word Tokenization
☆27Jan 19, 2025Updated last year
Alternatives and similar repositories for fast-vocabulary-transfer
Users who are interested in fast-vocabulary-transfer are comparing it to the libraries listed below.
- Accompanying code for "Analyzing Vision Transformers in Class Embedding Space" (NeurIPS '23)☆16Jun 10, 2024Updated 2 years ago
- Show the time in Roman Numerals☆11Jan 23, 2020Updated 6 years ago
- Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation"☆22Feb 14, 2024Updated 2 years ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper.☆12Mar 6, 2023Updated 3 years ago
- Evaluate Transformers from the Hub 🔥☆14May 26, 2026Updated 2 weeks ago
- Perf monitoring CLI tool for Apple Silicon☆10Jan 25, 2023Updated 3 years ago
- ☆15Oct 4, 2024Updated last year
- Ongoing research training transformer models at scale☆18Jul 27, 2023Updated 2 years ago
- ☆18Mar 26, 2022Updated 4 years ago
- A set of utilities to turn Dataclasses into useful configuration managers.☆11Mar 27, 2024Updated 2 years ago
- Get up in the morning by striking a pose to stop your alarm from ringing.☆12Jun 9, 2021Updated 5 years ago
- Code for Zero-Shot Tokenizer Transfer☆145Jan 14, 2025Updated last year
- [NeurIPS 2024 D&B Track] DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation☆13Mar 5, 2025Updated last year
- This is the repo for constructing a comprehensive and rigorous evaluation framework for LLM calibration.☆13Apr 9, 2024Updated 2 years ago
- PathPiece tokenizer☆14Nov 10, 2024Updated last year
- CaMML: Context-Aware MultiModal Learner for Large Models (ACL 2024 SAC Award)☆15May 21, 2025Updated last year
- CUDA keyring packaging for Debian☆14Apr 14, 2023Updated 3 years ago
- Code and data from the paper 'Human Feedback is not Gold Standard'☆21May 5, 2026Updated last month
- A short React project to refresh myself on best practices for using APIs with React☆11Jan 4, 2023Updated 3 years ago
- [ICLR 2024] "Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality" by Xuxi Chen*, Yu Yang*, Zhangyang Wang, Baha…☆15May 18, 2024Updated 2 years ago
- Font style transfer for Devanāgarī script using GANs☆13Jun 25, 2022Updated 3 years ago
- Diverse Demonstrations Improve In-context Compositional Generalization☆12Jul 7, 2023Updated 2 years ago
- Repository for Teaching Broad Reasoning Skills for Multi-Step QA by Generating Hard Contexts, EMNLP22☆19Jun 23, 2023Updated 2 years ago
- PyTorch implementation of the original evidential-deep-learning (https://github.com/aamini/evidential-deep-learning/)☆13Sep 20, 2021Updated 4 years ago
- Coala is a python package for Contextual Answer Sentence Selection.☆15Jun 12, 2023Updated 3 years ago
- Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…☆14Aug 8, 2025Updated 10 months ago
- Android web based memory scanner & editor.☆19Oct 3, 2023Updated 2 years ago
- ☆12Jun 30, 2024Updated last year
- Code for the paper "REV: Information-Theoretic Evaluation of Free-Text Rationales"☆16Aug 11, 2023Updated 2 years ago
- ☆20Feb 2, 2026Updated 4 months ago
- https://footprints.baulab.info☆18Oct 4, 2024Updated last year
- ☆16Jan 29, 2025Updated last year
- ☆15Mar 24, 2022Updated 4 years ago
- View instruction register through GameGuardian!☆13Dec 8, 2019Updated 6 years ago
- ☆18Mar 23, 2025Updated last year
- Curated list of open source and openly accessible large language models☆26Jul 16, 2023Updated 2 years ago
- ☆10Dec 27, 2018Updated 7 years ago
- Coding-agent VM orchestrator: runs coding agents in isolated VMs — Firecracker micro-VMs on Linux (with ZFS-based audit-trail snapshots) …☆31Updated this week
- ☆15Dec 8, 2022Updated 3 years ago