radixark / miles
☆564 · Updated this week
Alternatives and similar repositories for miles
Users interested in miles are comparing it to the libraries listed below.
- Checkpoint-engine is a simple middleware to update model weights in LLM inference engines ☆858 · Updated this week
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆322 · Updated last week
- Implementation of FP8/INT8 rollout for RL training without performance drop. ☆279 · Updated last month
- ByteCheckpoint: A Unified Checkpointing Library for LFMs ☆256 · Updated this week
- ☆933 · Updated last month
- PyTorch-native post-training at scale ☆559 · Updated this week
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆257 · Updated last week
- Hugging Face conversion and training library for Megatron-based models ☆250 · Updated this week
- Memory-optimized Mixture of Experts ☆69 · Updated 4 months ago
- An early-research-stage expert-parallel load balancer for MoE models based on linear programming. ☆456 · Updated 3 weeks ago
- PyTorch Distributed-native training library for LLMs/VLMs with out-of-the-box Hugging Face support ☆202 · Updated this week
- Bridge Megatron-Core to Hugging Face/Reinforcement Learning ☆168 · Updated last week
- [ICLR 2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation ☆242 · Updated 11 months ago
- [ICML 2024] CLLMs: Consistency Large Language Models ☆407 · Updated last year
- JAX backend for SGL ☆191 · Updated this week
- Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling ☆462 · Updated 6 months ago
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ☆345 · Updated this week
- ☆224 · Updated 2 weeks ago
- Triton-based implementation of Sparse Mixture of Experts. ☆253 · Updated 2 months ago
- ☆219 · Updated 10 months ago
- [NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning models without training. ☆212 · Updated 6 months ago
- KV cache compression for high-throughput LLM inference ☆146 · Updated 10 months ago
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… ☆236 · Updated 2 weeks ago
- [NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning ☆56 · Updated last month
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆249 · Updated 10 months ago
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA (+ more DSLs) ☆697 · Updated last week
- A Gym for Agentic LLMs ☆395 · Updated last month
- 🔥 LLM-powered GPU kernel synthesis: train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation… ☆105 · Updated last month
- Physics of Language Models, Part 4 ☆265 · Updated this week
- LLM KV cache compression made easy ☆709 · Updated this week