lm-sys / lm-sys.github.io
The source of LMSYS website and blogs
☆77 · Updated this week
Alternatives and similar repositories for lm-sys.github.io
Users interested in lm-sys.github.io are comparing it to the libraries listed below.
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… ☆263 · Updated this week
- Benchmark suite for LLMs from Fireworks.ai ☆89 · Updated this week
- PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support ☆266 · Updated this week
- [ICLR 2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆141 · Updated last year
- Triton-based implementation of Sparse Mixture of Experts. ☆263 · Updated 4 months ago
- ☆125 · Updated last year
- ByteCheckpoint: A Unified Checkpointing Library for LFMs ☆268 · Updated last month
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆219 · Updated this week
- ☆96 · Updated 10 months ago
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆279 · Updated 2 months ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆316 · Updated 2 years ago
- [ICML 2024] CLLMs: Consistency Large Language Models ☆410 · Updated last year
- Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training ☆222 · Updated last year
- Ship correct and fast LLM kernels to PyTorch ☆140 · Updated 3 weeks ago
- ☆219 · Updated last year
- [NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization ☆402 · Updated last year
- [NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning models without training. ☆218 · Updated 8 months ago
- A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM ☆220 · Updated this week
- Megatron's multi-modal data loader ☆315 · Updated last week
- PyTorch-native post-training at scale ☆605 · Updated last week
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆131 · Updated 4 months ago
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ☆148 · Updated last year
- Experiments on speculative sampling with Llama models ☆127 · Updated 2 years ago
- JAX backend for SGL ☆234 · Updated this week
- LM engine is a library for pretraining/finetuning LLMs ☆113 · Updated last week
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆113 · Updated 10 months ago
- REST: Retrieval-Based Speculative Decoding, NAACL 2024 ☆215 · Updated 4 months ago
- Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime. ☆830 · Updated this week
- Training library for Megatron-based models with bidirectional Hugging Face conversion capability ☆400 · Updated last week
- Explorations into some recent techniques surrounding speculative decoding ☆299 · Updated last year
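Several of the entries above (the long-sequence ICLR 2025 work, the vLLM speculators library, Ouroboros, REST, and the Llama speculative-sampling experiments) center on speculative decoding. As a rough orientation only, the sketch below shows the generic draft-then-verify loop; the `draft_model`/`target_model` callables and their `(seq_len, vocab)` probability outputs are hypothetical toy interfaces, not the API of any repository listed here.

```python
import torch

def speculative_decode(draft_model, target_model, prefix, gamma=4, max_new_tokens=64):
    """Minimal sketch of a draft-then-verify speculative decoding loop.

    Assumes `draft_model(tokens)` and `target_model(tokens)` each return a
    (seq_len, vocab) tensor of next-token probabilities -- hypothetical
    interfaces, not any listed library's API.
    """
    tokens = prefix.clone()
    while tokens.numel() - prefix.numel() < max_new_tokens:
        # 1. Draft: the small model proposes `gamma` tokens autoregressively.
        draft_tokens, draft_probs = [], []
        ctx = tokens
        for _ in range(gamma):
            p = draft_model(ctx)[-1]                 # next-token distribution
            t = torch.multinomial(p, 1)
            draft_tokens.append(t)
            draft_probs.append(p)
            ctx = torch.cat([ctx, t])

        # 2. Verify: the large model scores prefix + all drafted tokens in one pass.
        q = target_model(ctx)                        # shape (len(ctx), vocab)

        # 3. Accept each drafted token with probability min(1, q/p);
        #    stop at the first rejection.
        n_accepted = 0
        for i, (t, p) in enumerate(zip(draft_tokens, draft_probs)):
            q_i = q[tokens.numel() + i - 1]          # target dist. for that position
            if torch.rand(1) < (q_i[t] / p[t]).clamp(max=1.0):
                n_accepted += 1
            else:
                break
        tokens = torch.cat([tokens] + draft_tokens[:n_accepted])

        # 4. Sample one extra token from the target model at the cut point.
        #    (Exact speculative sampling would draw from the normalized residual
        #    (q - p)+ after a rejection; sampling from q keeps the sketch short.)
        tokens = torch.cat([tokens, torch.multinomial(q[tokens.numel() - 1], 1)])
    return tokens
```

The speedup in these projects comes from step 2: a single forward pass of the expensive target model verifies up to `gamma` cheaply drafted tokens at once instead of generating them one by one.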