snowflakedb / ArcticTraining
ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs)
⭐156 · Updated this week
Alternatives and similar repositories for ArcticTraining
Users interested in ArcticTraining are comparing it to the libraries listed below.
- ⭐173 · Updated this week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ⭐255 · Updated this week
- Load compute kernels from the Hub (see the kernels sketch after this list) ⭐203 · Updated this week
- ⭐214 · Updated 5 months ago
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ⭐138 · Updated 11 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ⭐129 · Updated this week
- [ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ⭐121 · Updated 7 months ago
- Boosting 4-bit inference kernels with 2:4 Sparsity ⭐80 · Updated 10 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ⭐205 · Updated this week
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS … ⭐59 · Updated 9 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ⭐318 · Updated 2 months ago
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance… ⭐150 · Updated 3 months ago
- Efficient LLM Inference over Long Sequences ⭐382 · Updated 2 weeks ago
- ⭐198 · Updated 5 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ⭐265 · Updated last year
- PyTorch building blocks for the OLMo ecosystem ⭐258 · Updated this week
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ⭐211 · Updated this week
- A safetensors extension to efficiently store sparse quantized tensors on disk ⭐135 · Updated this week
- Simple extension on vLLM to help you speed up reasoning models without training. ⭐166 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs (see the generation sketch after this list) ⭐264 · Updated 9 months ago
- Scalable toolkit for efficient model reinforcement ⭐499 · Updated this week
- BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach. ⭐203 · Updated 2 months ago
- Easy and Efficient Quantization for Transformers ⭐198 · Updated 3 weeks ago
- The official repo for "LLoCo: Learning Long Contexts Offline" ⭐117 · Updated last year
- KV cache compression for high-throughput LLM inference ⭐132 · Updated 5 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ⭐128 · Updated 7 months ago
- Benchmark suite for LLMs from Fireworks.ai ⭐76 · Updated last week
- An extension of the nanoGPT repository for training small MoE models. ⭐160 · Updated 4 months ago
- Code for studying the super weight in LLMs ⭐113 · Updated 7 months ago
- ⭐112 · Updated last year
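
The "Load compute kernels from the Hub" entry above matches the Hugging Face `kernels` project. A minimal sketch of what loading a Hub kernel looks like, assuming the repository follows that API; the kernel repo id and the `gelu_fast` entry point are taken from the upstream README and should be treated as illustrative:

```python
# Sketch only: assumes the Hugging Face `kernels` API (get_kernel) and a
# CUDA-capable machine; the kernel repo id and function name are illustrative.
import torch
from kernels import get_kernel

# Downloads the kernel from the Hub and loads it as a regular Python module.
activation = get_kernel("kernels-community/activation")

x = torch.randn((4, 1024), dtype=torch.float16, device="cuda")
out = torch.empty_like(x)
activation.gelu_fast(out, x)  # assumed out-of-place (output, input) signature
print(out.shape)
```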
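The "high-throughput and memory-efficient inference and serving engine" entry carries vLLM's tagline, so it is presumably a vLLM fork. A minimal generation sketch, assuming the fork keeps the standard upstream vLLM API; the model name is illustrative:

```python
# Sketch only: assumes the standard vLLM offline-inference API is unchanged.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # illustrative model; any HF causal LM works
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# Batched generation: one RequestOutput per input prompt.
outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```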