NVIDIA-NeMo / Run
A tool to configure, launch and manage your machine learning experiments.
☆161 · Updated this week
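For orientation, NeMo-Run's workflow is roughly "wrap a function or script as a configurable task, pick an executor, launch it". Below is a minimal sketch of that pattern; it assumes the `nemo_run` Python API (`run.Partial`, `run.LocalExecutor`, `run.run`) and is meant as an illustration rather than a verbatim recipe, so check the project documentation for exact signatures.

```python
# Minimal sketch (assumed API): configure a task and launch it locally with NeMo-Run.
import nemo_run as run


def train(learning_rate: float = 1e-3, epochs: int = 10) -> None:
    # Stand-in for a real training loop.
    print(f"training for {epochs} epochs at lr={learning_rate}")


if __name__ == "__main__":
    # Describe the task without running it yet.
    task = run.Partial(train, learning_rate=3e-4, epochs=5)

    # Execute on the local machine; swapping in another executor
    # (e.g. a Slurm executor) targets a cluster with the same task definition.
    run.run(task, executor=run.LocalExecutor())
```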
Alternatives and similar repositories for Run
Users that are interested in Run are comparing it to the libraries listed below
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆253 · Updated this week
- Load compute kernels from the Hub ☆172 · Updated this week
- This repository contains the experimental PyTorch native float8 training UX ☆224 · Updated 10 months ago
- Scalable toolkit for efficient model reinforcement ☆438 · Updated this week
- PyTorch per step fault tolerance (actively under development) ☆329 · Updated this week
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆204 · Updated this week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆61 · Updated 2 months ago
- Google TPU optimizations for transformers models ☆113 · Updated 5 months ago
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the… ☆177 · Updated 2 weeks ago
- ☆212 · Updated 4 months ago
- Applied AI experiments and examples for PyTorch ☆277 · Updated 3 weeks ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆188 · Updated this week
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆126 · Updated this week
- Fast low-bit matmul kernels in Triton ☆322 · Updated this week
- PyTorch Single Controller ☆218 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆264 · Updated 8 months ago
- Scalable and Performant Data Loading ☆277 · Updated this week
- Triton-based implementation of Sparse Mixture of Experts. ☆219 · Updated 6 months ago
- ☆108 · Updated last year
- Megatron's multi-modal data loader ☆213 · Updated last week
- Cataloging released Triton kernels. ☆236 · Updated 5 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆119 · Updated this week
- ring-attention experiments ☆144 · Updated 8 months ago
- Easy and Efficient Quantization for Transformers ☆199 · Updated 4 months ago
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆186 · Updated last month
- A family of compressed models obtained via pruning and knowledge distillation ☆343 · Updated 7 months ago
- Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native. ☆504 · Updated 2 months ago
- A project to improve skills of large language models ☆423 · Updated this week
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆157 · Updated 6 months ago
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆304 · Updated 3 weeks ago