Shaping capabilities with token-level pretraining data filtering
☆85 · Jan 28, 2026 · Updated last month
Alternatives and similar repositories for token-filtering
Users who are interested in token-filtering are comparing it to the libraries listed below.
- implementation of https://arxiv.org/pdf/2312.09299 ☆21 · Jul 3, 2024 · Updated last year
- A tiny package supporting distributed computation of COCO metrics for PyTorch models. ☆15 · Feb 28, 2023 · Updated 3 years ago
- Does patch ordering affect context-limited vision transformers? ☆17 · Oct 10, 2025 · Updated 5 months ago
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆28 · Mar 9, 2026 · Updated last week
- A toy text-to-image model trained from scratch. ☆19 · Jun 9, 2025 · Updated 9 months ago
- Implementation of numerous Vision Transformers in Google's JAX and Flax. ☆22 · Aug 30, 2022 · Updated 3 years ago
- 🌾 Universal, customizable and deployable fine-grained evaluation for text generation. ☆24 · Oct 26, 2023 · Updated 2 years ago
- Research work aimed at addressing the problem of modeling infinite-length context ☆48 · Dec 18, 2025 · Updated 3 months ago
- Simple and scalable tools for data-driven pretraining data selection. ☆29 · Jun 9, 2025 · Updated 9 months ago
- Code for the paper "Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages" (N… ☆17 · Apr 13, 2025 · Updated 11 months ago
- Materials for EACL2024 tutorial: Transformer-specific Interpretability ☆63 · Mar 26, 2024 · Updated last year
- ☆22 · Dec 3, 2021 · Updated 4 years ago
- MoE training for Me and You and maybe other people ☆375 · Updated this week
- ☆14 · Jun 24, 2024 · Updated last year
- Demos of ChatGPT's function calling/structured data support. ☆24 · Dec 21, 2023 · Updated 2 years ago
- QLoRA: Efficient Finetuning of Quantized LLMs ☆11 · Jul 22, 2023 · Updated 2 years ago
- Codebase from our first release. ☆48 · Feb 17, 2026 · Updated last month
- Mamba support for transformer lens ☆19 · Sep 17, 2024 · Updated last year
- Fluent student-teacher redteaming ☆23 · Jul 25, 2024 · Updated last year
- ☆40 · Jan 14, 2025 · Updated last year
- ☆22 · Apr 28, 2025 · Updated 10 months ago
- Database for International Physics Olympiads ☆10 · Sep 22, 2025 · Updated 5 months ago
- Implementation of Direct Preference Optimization ☆17 · Jul 17, 2023 · Updated 2 years ago
- Cross Atlas Remapping via Optimal Transport ☆12 · Dec 14, 2023 · Updated 2 years ago
- entropix style sampling + GUI ☆27 · Oct 30, 2024 · Updated last year
- coded with and corrected by Google Anti-Gravity ☆13 · Nov 23, 2025 · Updated 3 months ago
- ☆13 · Mar 15, 2022 · Updated 4 years ago
- ☆16 · Jul 7, 2025 · Updated 8 months ago
- tuimorphic choose-your-own-adventure story game ☆18 · Mar 3, 2026 · Updated 2 weeks ago
- ☆13 · Aug 20, 2021 · Updated 4 years ago
- LVAS-Agent Code Base ☆22 · Apr 15, 2025 · Updated 11 months ago
- a simple variational auto encoder with some exploration ☆12 · Nov 22, 2024 · Updated last year
- Decoupled Q-Chunking ☆59 · Jan 10, 2026 · Updated 2 months ago
- [ICLR 2025] On Evaluating the Durability of Safeguards for Open-Weight LLMs ☆13 · Jun 20, 2025 · Updated 9 months ago
- ☆17 · Mar 4, 2025 · Updated last year
- ☆15 · Oct 31, 2023 · Updated 2 years ago
- ☆34 · Nov 11, 2025 · Updated 4 months ago
- ☆21 · Mar 18, 2025 · Updated last year
- Materials for "Prompting is not a substitute for probability measurements in large language models" (EMNLP 2023) ☆24 · Oct 24, 2023 · Updated 2 years ago