snowflakedb / ArcticTraining
ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs).
★190 · Updated this week
Alternatives and similar repositories for ArcticTraining
Users interested in ArcticTraining are comparing it to the libraries listed below.
- ArcticInference: vLLM plugin for high-throughput, low-latency inference · ★203 · Updated this week
- Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… · ★258 · Updated last week
- ★206 · Updated 5 months ago
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… · ★138 · Updated 11 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. · ★134 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs · ★265 · Updated 9 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 · ★323 · Updated 3 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters · ★268 · Updated last year
- An extension of the nanoGPT repository for training small MoE models. · ★164 · Updated 4 months ago
- Load compute kernels from the Hub · ★220 · Updated this week
- PyTorch building blocks for the OLMo ecosystem · ★269 · Updated this week
- KV cache compression for high-throughput LLM inference · ★134 · Updated 6 months ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research · ★217 · Updated this week
- BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach. · ★208 · Updated 3 months ago
- ★215 · Updated 6 months ago
- Efficient LLM Inference over Long Sequences · ★385 · Updated last month
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS … · ★61 · Updated 9 months ago
- Benchmark suite for LLMs from Fireworks.ai · ★76 · Updated this week
- Scalable toolkit for efficient model reinforcement · ★558 · Updated this week
- Simple extension on vLLM to help you speed up reasoning models without training. · ★172 · Updated 2 months ago
- Boosting 4-bit inference kernels with 2:4 Sparsity · ★80 · Updated 11 months ago
- Simple and efficient DeepSeek V3 SFT using pipeline parallelism and expert parallelism, with both FP8 and BF16 training · ★68 · Updated last week
- Easy and Efficient Quantization for Transformers · ★198 · Updated last month
- [ICLR 2025] Breaking the Throughput-Latency Trade-off for Long Sequences with Speculative Decoding · ★123 · Updated 8 months ago
- LLM KV cache compression made easy · ★566 · Updated this week
- Multipack distributed sampler for fast padding-free training of LLMs · ★199 · Updated 11 months ago
- Storing long contexts in tiny caches with self-study · ★121 · Updated this week
- A project to improve the skills of large language models · ★501 · Updated this week
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache · ★113 · Updated 3 weeks ago
- Code for training and evaluating Contextual Document Embedding models · ★196 · Updated 2 months ago