NVIDIA-NeMo / Run
A tool to configure, launch and manage your machine learning experiments.
⭐ 208 · Updated last week
Alternatives and similar repositories for Run
Users interested in Run are comparing it to the libraries listed below
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… (FSDP and SDPA are sketched after this list) ⭐ 271 · Updated last week
- PyTorch Distributed-native training library for LLMs/VLMs with out-of-the-box Hugging Face support ⭐ 194 · Updated this week
- Load compute kernels from the Hub (usage sketched after this list) ⭐ 337 · Updated last week
- Scalable and Performant Data Loading ⭐ 349 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ⭐ 267 · Updated last year
- Google TPU optimizations for transformers models ⭐ 123 · Updated 10 months ago
- Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU) ⭐ 201 · Updated this week
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ⭐ 454 · Updated 3 weeks ago
- Efficient LLM Inference over Long Sequences ⭐ 392 · Updated 5 months ago
- ⭐ 219 · Updated 10 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ⭐ 217 · Updated last week
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ⭐ 262 · Updated last week
- A family of compressed models obtained via pruning and knowledge distillation ⭐ 357 · Updated last month
- 🏎️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ⭐ 320 · Updated 2 months ago
- Easy and Efficient Quantization for Transformers ⭐ 203 · Updated 5 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients (the low-rank projection idea is sketched after this list). ⭐ 202 · Updated last year
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ⭐ 78 · Updated 2 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ⭐ 257 · Updated this week
- This repository contains the experimental PyTorch native float8 training UX (its torchao successor is sketched after this list) ⭐ 226 · Updated last year
- Megatron's multi-modal data loader ⭐ 280 · Updated this week
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ⭐ 327 · Updated this week
- 👷 Build compute kernels ⭐ 190 · Updated this week
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ⭐ 349 · Updated 7 months ago
- FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference (FlexAttention itself is sketched after this list). ⭐ 313 · Updated last month
- Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native. ⭐ 509 · Updated 7 months ago
- ⭐ 267 · Updated last week
- A safetensors extension to efficiently store sparse quantized tensors on disk (the underlying safetensors API is sketched after this list) ⭐ 214 · Updated this week
- PyTorch-native post-training at scale ⭐ 549 · Updated last week
- Where GPUs get cooked 👩‍🍳🔥 ⭐ 319 · Updated 2 months ago
- ⭐ 317 · Updated last week
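A few entries above name concrete PyTorch-ecosystem techniques; the short sketches below illustrate them. All are minimal, hedged examples, not code from the listed repositories.

The FSDP/SDPA entry relies on two native PyTorch features: the `scaled_dot_product_attention` kernel and FSDP sharding. A sketch of both (the `TinyAttention` module is illustrative):

```python
# Minimal sketch: SDPA dispatches to a fused FlashAttention kernel when
# available; FSDP shards the wrapped module's parameters across ranks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAttention(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.heads = heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (batch, heads, seq, head_dim) layout expected by SDPA
        q, k, v = (z.view(b, t, self.heads, d // self.heads).transpose(1, 2)
                   for z in (q, k, v))
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.out(y.transpose(1, 2).reshape(b, t, d))

y = TinyAttention()(torch.randn(2, 16, 256))  # runs as-is on CPU

# FSDP wrapping needs an initialized process group (e.g. under torchrun):
# from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
# model = FSDP(TinyAttention().cuda())
```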
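The "Load compute kernels from the Hub" entry appears to be the Hugging Face `kernels` project; a sketch following its README usage. The repo id and the `gelu_fast` entry point are taken from that README and should be treated as assumptions for other kernels:

```python
# Sketch assuming the `kernels` package and a CUDA GPU; the Hub repo id
# and the out-argument calling convention follow the project README.
import torch
from kernels import get_kernel

activation = get_kernel("kernels-community/activation")  # fetches a prebuilt kernel
x = torch.randn(10, 10, dtype=torch.float16, device="cuda")
y = torch.empty_like(x)
activation.gelu_fast(y, x)  # writes the activation result into y
```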
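Q-GaLore extends GaLore, whose core idea is to keep optimizer state in a low-rank subspace of each weight's gradient; the INT4-quantized projection Q-GaLore adds is out of scope here. A conceptual sketch of the plain projection step:

```python
# Conceptual sketch of GaLore-style low-rank gradient projection (not the
# Q-GaLore code): optimizer moments live in an r-dimensional subspace
# spanned by the gradient's top-r left singular vectors.
import torch

def low_rank_project(grad: torch.Tensor, r: int):
    U, _, _ = torch.linalg.svd(grad, full_matrices=False)
    P = U[:, :r]                 # m x r orthonormal basis
    return P, P.T @ grad         # compact r x n gradient representation

grad = torch.randn(512, 512)
P, g_low = low_rank_project(grad, r=32)   # Adam states would track g_low
update = P @ g_low                        # project back before the weight update
```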
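The experimental float8 repository was later upstreamed into torchao. A sketch against that successor API; the import path is assumed from torchao's documentation, and the fused float8 matmuls need an H100-class GPU:

```python
# Sketch, assuming torchao's float8 training API (successor to the
# experimental repo above) and float8-capable hardware.
import torch
import torch.nn as nn
from torchao.float8 import convert_to_float8_training

model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024)).cuda()
convert_to_float8_training(model)  # swaps nn.Linear for float8 variants in place
loss = model(torch.randn(8, 1024, device="cuda")).sum()
loss.backward()                    # the training loop itself is unchanged
```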
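The Gemma 2 engine builds on PyTorch's FlexAttention (torch >= 2.5), which lets you rewrite attention scores through a `score_mod` callback. The 50.0 soft-cap below mirrors Gemma 2's published logit soft-capping and is this sketch's choice, not that repository's code:

```python
# Sketch of torch.nn.attention.flex_attention. It runs eagerly here via a
# slow reference path; in practice it is wrapped in torch.compile on GPU.
import torch
from torch.nn.attention.flex_attention import flex_attention

def softcap(score, b, h, q_idx, kv_idx):
    # Gemma-2-style attention logit soft-capping as a score_mod
    return 50.0 * torch.tanh(score / 50.0)

q, k, v = (torch.randn(1, 4, 128, 64) for _ in range(3))
out = flex_attention(q, k, v, score_mod=softcap)  # (batch, heads, seq, head_dim)
```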
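The sparse-tensor extension builds on the plain safetensors format, which stores only dense tensors; persisting a COO tensor's components by hand, as below, shows the gap the extension fills (the two-key layout is this sketch's convention, not the extension's on-disk format):

```python
# Sketch using the standard safetensors API; the sparse tensor is decomposed
# into dense values/indices because safetensors itself stores only dense data.
import torch
from safetensors.torch import save_file, load_file

dense = torch.randn(4, 4)
dense[dense.abs() < 1.0] = 0.0            # zero out most entries
sp = dense.to_sparse().coalesce()

save_file({"values": sp.values(), "indices": sp.indices()}, "sparse.safetensors")

loaded = load_file("sparse.safetensors")
restored = torch.sparse_coo_tensor(loaded["indices"], loaded["values"], size=(4, 4))
assert torch.equal(restored.to_dense(), dense)
```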