linkedin / Liger-Kernel
Efficient Triton Kernels for LLM Training
☆3,454 · Updated this week
Related projects
Alternatives and complementary repositories for Liger-Kernel
- A native PyTorch library for large model training ☆2,623 · Updated this week
- Implementation for MatMul-free LM. ☆2,920 · Updated 2 weeks ago
- PyTorch-native quantization and sparsity for training and inference ☆1,585 · Updated this week
- PyTorch-native fine-tuning library ☆4,336 · Updated this week
- SGLang is a fast serving framework for large language models and vision-language models. ☆6,127 · Updated this week
- nanoGPT-style version of Llama 3.1 ☆1,246 · Updated 3 months ago
- NanoGPT (124M) quality in 7.8 8xH100-minutes ☆1,033 · Updated this week
- Run PyTorch LLMs locally on servers, desktops, and mobile ☆3,383 · Updated this week
- Minimalistic large-language-model 3D-parallelism training ☆1,260 · Updated this week
- Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale. ☆2,489 · Updated this week
- Tile primitives for speedy kernels ☆1,658 · Updated this week
- Freeing data processing from scripting madness by providing a set of platform-agnostic, customizable pipeline processing blocks. ☆2,045 · Updated this week
- Tools for merging pretrained large language models. ☆4,816 · Updated 2 weeks ago
- ☆2,746 · Updated 2 months ago
- Composable building blocks to build Llama apps ☆4,594 · Updated this week
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ☆2,205 · Updated this week
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection ☆1,435 · Updated 3 weeks ago
- Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors a… ☆1,199 · Updated this week
- High-quality datasets, tools, and concepts for LLM fine-tuning. ☆2,010 · Updated 3 weeks ago
- Meta Lingua: a lean, efficient, and easy-to-hack codebase for researching LLMs. ☆4,227 · Updated last week
- ReFT: Representation Finetuning for Language Models ☆1,159 · Updated 2 weeks ago
- FlashInfer: kernel library for LLM serving ☆1,452 · Updated this week
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ☆2,526 · Updated last month
- Puzzles for learning Triton ☆1,135 · Updated this week
- UNet diffusion model in pure CUDA ☆584 · Updated 4 months ago
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality! ☆3,256 · Updated 3 months ago
- Entropy-based sampling and parallel CoT decoding ☆3,036 · Updated last week
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) ☆1,155 · Updated 2 weeks ago
- Scalable data preprocessing and curation toolkit for LLMs ☆615 · Updated this week
- Training LLMs with QLoRA + FSDP ☆1,418 · Updated last week