MoonshotAI / checkpoint-engine

Checkpoint-engine is a simple middleware to update model weights in LLM inference engines

☆851

Alternatives and similar repositories for checkpoint-engine

Users who are interested in checkpoint-engine are comparing it to the libraries listed below.

deepseek-ai / LPLB
An early research stage MoE load balancer based on linear programming.
☆415 Updated 2 weeks ago
thinking-machines-lab / batch_invariant_ops
☆917 Updated last month
radixark / miles
☆317 Updated this week
NVIDIA-NeMo / RL
Scalable toolkit for efficient model reinforcement
☆1,048 Updated last week
snowflakedb / ArcticInference
ArcticInference: vLLM plugin for high-throughput, low-latency inference
☆327 Updated this week
meta-pytorch / torchforge
PyTorch-native post-training at scale
☆549 Updated last week
ByteDance-Seed / ByteCheckpoint
ByteCheckpoint: An Unified Checkpointing Library for LFMs
☆254 Updated this week
NVIDIA / kvpress
LLM KV cache compression made easy
☆701 Updated this week
PrimeIntellect-ai / prime-rl
Async RL Training at Scale
☆867 Updated this week
QwenLM / ParScale
Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling
☆456 Updated 6 months ago
MoonshotAI / Kimi-Linear
☆1,215 Updated 2 weeks ago
sgl-project / SpecForge
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
☆498 Updated last week
snowflakedb / ArcticTraining
ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs)
☆254 Updated last week
perplexityai / pplx-kernels
Perplexity GPU Kernels
☆534 Updated 3 weeks ago
NovaSky-AI / SkyRL
SkyRL: A Modular Full-stack RL Library for LLMs
☆1,287 Updated last week
deepseek-ai / DeepSeek-V3.2-Exp
☆1,242 Updated 2 weeks ago
NVIDIA / Star-Attention
Efficient LLM Inference over Long Sequences
☆392 Updated 5 months ago
NVIDIA-NeMo / Automodel
PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support
☆187 Updated last week
sgl-project / genai-bench
Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv…
☆234 Updated this week
sgl-project / sgl-learning-materials
Materials for learning SGLang
☆658 Updated last week
tile-ai / TileRT
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
☆261 Updated last week
NVIDIA-NeMo / Megatron-Bridge
HuggingFace conversion and training library for Megatron-based models
☆228 Updated this week
ovg-project / kvcached
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
☆691 Updated this week
vllm-project / vllm-omni
A framework for efficient model inference with omni-modality models
☆466 Updated this week
ScalingIntelligence / KernelBench
KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA (+ more DSLs)
☆683 Updated this week
stepfun-ai / Step3
☆439 Updated 3 months ago
meta-pytorch / torchcomms
torchcomms: a modern PyTorch communications API
☆295 Updated last week
changjonathanc / flex-nano-vllm
FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.
☆313 Updated last month
alibaba / ROCK
A construction kit for reinforcement learning environment management.
☆226 Updated this week
mit-han-lab / duo-attention
[ICLR 2025] DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
☆507 Updated 9 months ago