Vocabulary Parallelism
☆25 · Mar 10, 2025 · Updated last year
Alternatives and similar repositories for VocabularyParallelism
Users that are interested in VocabularyParallelism are comparing it to the libraries listed below.
- Sequence-level 1F1B schedule for LLMs. ☆38 · Aug 26, 2025 · Updated 7 months ago
- Zero Bubble Pipeline Parallelism ☆452 · May 7, 2025 · Updated 11 months ago
- My notes for reading leveldb ☆11 · Apr 19, 2024 · Updated last year
- An example showing how to use jax to train resnet50 on multi-node multi-GPU ☆20 · Jul 4, 2022 · Updated 3 years ago
- Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning ☆31 · Sep 12, 2025 · Updated 7 months ago
- MIT 6.824 2020 ☆10 · Mar 31, 2021 · Updated 5 years ago
- PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design (KDD 2025) ☆31 · Jun 14, 2024 · Updated last year
- ☆13 · Jun 29, 2024 · Updated last year
- Official code repository of Shuffle-R1 ☆25 · Feb 23, 2026 · Updated last month
- MUA-RL: MULTI-TURN USER-INTERACTING AGENT REINFORCEMENT LEARNING FOR AGENTIC TOOL USE ☆59 · Nov 5, 2025 · Updated 5 months ago
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi… ☆33 · Oct 30, 2025 · Updated 5 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆82 · Dec 25, 2025 · Updated 3 months ago
- ☆38 · Jan 10, 2026 · Updated 3 months ago
- Reading seminar in Harvard Cloud Networking and Systems Group ☆16 · Aug 29, 2022 · Updated 3 years ago
- Extended Few-Shot Learning: Exploiting Existing Resources for Novel Tasks ☆10 · Jul 6, 2021 · Updated 4 years ago
- ☆15 · Nov 5, 2024 · Updated last year
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models ☆11 · Dec 13, 2023 · Updated 2 years ago
- Unofficial wheels for some machine-learning Python libraries, for the Nvidia Jetson Nano. ☆17 · Aug 24, 2021 · Updated 4 years ago
- ☆16 · Jul 23, 2024 · Updated last year
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆54 · Jul 15, 2025 · Updated 8 months ago
- Hydragen: High-Throughput LLM Inference with Shared Prefixes ☆50 · May 10, 2024 · Updated last year
- PyTorch bindings for CUTLASS grouped GEMM. ☆184 · Feb 19, 2026 · Updated last month
- ☆22 · Apr 17, 2025 · Updated 11 months ago
- ☆35 · Jan 30, 2026 · Updated 2 months ago
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp… ☆15 · Jun 6, 2023 · Updated 2 years ago
- ☆16 · Jan 14, 2025 · Updated last year
- ☆16 · Jul 12, 2024 · Updated last year
- PyTorch bindings for CUTLASS grouped GEMM. ☆148 · May 29, 2025 · Updated 10 months ago
- A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza… ☆20 · Nov 21, 2024 · Updated last year
- [ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU. ☆496 · Mar 30, 2026 · Updated last week
- [NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression ☆50 · Mar 13, 2026 · Updated 3 weeks ago
- [ACL 2026] A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models ☆21 · Updated this week
- ☆12 · Apr 9, 2025 · Updated last year
- ☆14 · Jan 20, 2025 · Updated last year
- Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training ☆222 · Aug 19, 2024 · Updated last year
- [ACL 2025 (Findings)] DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling ☆22 · Dec 16, 2024 · Updated last year
- ☆26 · Updated this week
- ☆28 · Jun 1, 2021 · Updated 4 years ago
- ☆15 · Jan 27, 2026 · Updated 2 months ago