[EMNLP 2022/2023] Fast Vocabulary Transfer & Multi-word Tokenization
☆27Jan 19, 2025Updated last year
Alternatives and similar repositories for fast-vocabulary-transfer
Users that are interested in fast-vocabulary-transfer are comparing it to the libraries listed below.
- Show the time in Roman Numerals☆11Jan 23, 2020Updated 6 years ago
- Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation"☆22Feb 14, 2024Updated 2 years ago
- CSE201 Object-Oriented Programming in C++: Teach an AI to produce pieces of music☆12Jan 23, 2019Updated 7 years ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper.☆12Mar 6, 2023Updated 3 years ago
- [ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents☆28Feb 17, 2026Updated 2 months ago
- Merge multi-track MIDI sequence into a single track for further processing☆12Nov 4, 2020Updated 5 years ago
- Perf monitoring CLI tool for Apple Silicon☆10Jan 25, 2023Updated 3 years ago
- A set of utilities to turn Dataclasses into useful configuration managers.☆11Mar 27, 2024Updated 2 years ago
- ☆16Mar 4, 2024Updated 2 years ago
- Get up in the morning by striking a pose to stop your alarm from ringing.☆12Jun 9, 2021Updated 4 years ago
- Code for Zero-Shot Tokenizer Transfer☆144Jan 14, 2025Updated last year
- This is the repo for constructing a comprehensive and rigorous evaluation framework for LLM calibration.☆13Apr 9, 2024Updated 2 years ago
- PathPiece tokenizer☆14Nov 10, 2024Updated last year
- Code from the CMU LM inference fall 2025 edition.☆36Dec 7, 2025Updated 4 months ago
- CaMML: Context-Aware MultiModal Learner for Large Models (ACL 2024 SAC Award)☆15May 21, 2025Updated 10 months ago
- ☆10Jul 13, 2018Updated 7 years ago
- CUDA keyring packaging for Debian☆14Apr 14, 2023Updated 3 years ago
- Code and data from the paper 'Human Feedback is not Gold Standard'☆20Apr 1, 2026Updated 2 weeks ago
- C# bindings for llama.cpp for Unity☆15Dec 13, 2024Updated last year
- A short React project to refresh myself on API best practices when used with React☆11Jan 4, 2023Updated 3 years ago
- [ICLR 2024] "Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality" by Xuxi Chen*, Yu Yang*, Zhangyang Wang, Baha…☆15May 18, 2024Updated last year
- ☆25Mar 30, 2026Updated 3 weeks ago
- Font style transfer for Devanāgarī script using GANs☆12Jun 25, 2022Updated 3 years ago
- Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments (Zhou et al., EMNLP 2024)☆14Oct 3, 2024Updated last year
- PyTorch implementation of the original evidental-deep-learning@https://github.com/aamini/evidential-deep-learning/☆13Sep 20, 2021Updated 4 years ago
- Coala is a python package for Contextual Answer Sentence Selection.☆15Jun 12, 2023Updated 2 years ago
- Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…☆13Aug 8, 2025Updated 8 months ago
- The elegant integration of huggingface/nlp and fastai2 and handy transforms using pure huggingface/nlp☆19Oct 6, 2020Updated 5 years ago
- Android web based memory scanner & editor.☆19Oct 3, 2023Updated 2 years ago
- ☆12Jun 30, 2024Updated last year
- Code for the paper "REV: Information-Theoretic Evaluation of Free-Text Rationales"☆16Aug 11, 2023Updated 2 years ago
- ☆20Feb 2, 2026Updated 2 months ago
- https://footprints.baulab.info☆18Oct 4, 2024Updated last year
- LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation☆23Apr 24, 2024Updated last year
- ☆10Feb 28, 2025Updated last year
- ☆15Jan 29, 2025Updated last year
- ☆15Mar 24, 2022Updated 4 years ago
- ☆17Mar 23, 2025Updated last year
- Anticancer Peptide Identification employing Multi-headed Deep-CNN☆13Nov 18, 2021Updated 4 years ago