NVIDIA-NeMo / Run
A tool to configure, launch and manage your machine learning experiments.
⭐182 · Updated this week
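For a sense of how the tool is driven, here is a minimal launch sketch in the spirit of the project's README; `run.Partial`, `run.LocalExecutor`, and `run.run` are assumed names that may differ across NeMo-Run versions, so treat this as an illustration rather than verified API:

```python
# Hypothetical NeMo-Run hello-world; the nemo_run names below are assumed
# from the project's README and may not match the current release.
import nemo_run as run


def echo(message: str) -> None:
    # Any plain Python function can be the task that gets launched.
    print(message)


if __name__ == "__main__":
    # Wrap the function and its arguments into a configurable task, then
    # execute it on the local machine; cluster executors (e.g. Slurm)
    # follow the same pattern with a different executor object.
    task = run.Partial(echo, message="hello from NeMo-Run")
    run.run(task, executor=run.LocalExecutor())
```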
Alternatives and similar repositories for Run
Users interested in Run are comparing it to the libraries listed below:
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… (see the SDPA sketch after this list) ⭐260 · Updated last month
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ⭐383 · Updated last week
- Scalable and Performant Data Loading ⭐291 · Updated last week
- Google TPU optimizations for transformers models ⭐118 · Updated 7 months ago
- Load compute kernels from the Hub ⭐244 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ⭐266 · Updated 10 months ago
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ⭐67 · Updated 4 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ⭐208 · Updated last week
- ⭐217 · Updated 7 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ⭐200 · Updated last week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ⭐192 · Updated this week
- This repository contains the experimental PyTorch native float8 training UX ⭐224 · Updated last year
- ⭐289 · Updated 2 weeks ago
- Efficient LLM Inference over Long Sequences ⭐389 · Updated last month
- Scalable toolkit for efficient model reinforcement ⭐626 · Updated this week
- Easy and Efficient Quantization for Transformers ⭐201 · Updated 2 months ago
- Fine-tune any Hugging Face LLM or VLM on day-0 using PyTorch-native features for GPU-accelerated distributed training with superior perfo… ⭐40 · Updated last week
- PyTorch Single Controller ⭐361 · Updated last week
- ⭐211 · Updated 6 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ⭐327 · Updated 3 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ⭐149 · Updated last week
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ⭐210 · Updated last week
- OpenAI compatible API for TensorRT LLM triton backend ⭐213 · Updated last year
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ⭐309 · Updated this week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ⭐369 · Updated 2 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ⭐199 · Updated last year
- ⭐118 · Updated last year
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ⭐344 · Updated 8 months ago
- Megatron's multi-modal data loader ⭐237 · Updated last week
- Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native. ⭐507 · Updated 4 months ago
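Several entries above build on PyTorch-native features such as FSDP and scaled dot-product attention (SDPA). As a minimal, self-contained illustration, the sketch below calls the stock `torch.nn.functional.scaled_dot_product_attention` API; the tensor shapes are arbitrary toy values, not taken from any specific repository.

```python
import torch
import torch.nn.functional as F

# Toy tensors shaped (batch, heads, seq_len, head_dim).
q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# scaled_dot_product_attention dispatches to a fused FlashAttention or
# memory-efficient kernel when the backend supports it, and otherwise
# falls back to the plain math implementation.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```

On a CUDA build with supported dtypes, this single call is where the Flash-style attention speedups referenced by the FSDP pretraining entry above come from; on CPU it still runs correctly via the fallback path.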