ISEEKYAN / mbridge
☆33Updated this week
Alternatives and similar repositories for mbridge
Users that are interested in mbridge are comparing it to the libraries listed below
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton☆28Updated 4 months ago
- Odysseus: Playground of LLM Sequence Parallelism☆70Updated last year
- Accelerate LLM preference tuning via prefix sharing with a single line of code☆41Updated last month
- ☆77Updated 2 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection☆45Updated 7 months ago
- ☆30Updated last month
- The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.☆46Updated 8 months ago
- ☆47Updated 2 weeks ago
- [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank☆48Updated 7 months ago
- ☆33Updated 9 months ago
- Nano repo for RL training of LLMs☆61Updated 2 weeks ago
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification☆54Updated 3 months ago
- ☆104Updated 2 weeks ago
- [ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length☆90Updated 2 months ago
- ☆53Updated last week
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm…☆32Updated last month
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)☆106Updated 3 months ago
- Estimate MFU for DeepSeekV3☆24Updated 5 months ago
- ☆114Updated 3 weeks ago
- Async pipelined version of Verl☆100Updated 2 months ago
- Due to the huge vocabulary size (151,936) of Qwen models, the Embedding and LM Head weights are excessively heavy. Therefore, this projec…☆22Updated 10 months ago
- A simple calculation for LLM MFU.☆38Updated 3 months ago
- ☆45Updated last year
- Vocabulary Parallelism☆19Updated 3 months ago
- Repository of LV-Eval Benchmark☆67Updated 9 months ago
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…☆127Updated this week
- [ICLR 2025] TidalDecode: A Fast and Accurate LLM Decoding with Position Persistent Sparse Attention☆39Updated 2 months ago
- ☆63Updated 7 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates.☆125Updated this week
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆189Updated 3 months ago