NVIDIA-NeMo / Run
A tool to configure, launch and manage your machine learning experiments.
⭐169 · Updated this week
Alternatives and similar repositories for Run
Users that are interested in Run are comparing it to the libraries listed below
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ⭐255 · Updated this week
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ⭐359 · Updated last week
- Load compute kernels from the Hub ⭐203 · Updated this week
- Google TPU optimizations for transformers models ⭐114 · Updated 5 months ago
- Scalable toolkit for efficient model reinforcement ⭐499 · Updated this week
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ⭐156 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ⭐264 · Updated 9 months ago
- Scalable and Performant Data Loading ⭐288 · Updated this week
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ⭐205 · Updated this week
- ⭐214 · Updated 5 months ago
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ⭐64 · Updated 3 months ago
- Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native. ⭐505 · Updated 2 months ago
- Efficient LLM Inference over Long Sequences ⭐382 · Updated 2 weeks ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ⭐318 · Updated 2 months ago
- A family of compressed models obtained via pruning and knowledge distillation ⭐343 · Updated 8 months ago
- ⭐173 · Updated this week
- This repository contains the experimental PyTorch native float8 training UX ⭐224 · Updated 11 months ago
- Easy and Efficient Quantization for Transformers ⭐198 · Updated 2 weeks ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ⭐190 · Updated this week
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ⭐211 · Updated this week
- ⭐142 · Updated this week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ⭐354 · Updated last month
- ⭐271 · Updated last month
- 🏎️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ⭐305 · Updated last month
- PyTorch Single Controller ⭐318 · Updated this week
- A safetensors extension to efficiently store sparse quantized tensors on disk ⭐135 · Updated this week
- ⭐198 · Updated 5 months ago
- Megatron's multi-modal data loader ⭐219 · Updated this week
- LLM KV cache compression made easy ⭐535 · Updated this week
- ⭐161 · Updated last year
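Several entries above build on PyTorch's native training primitives (FSDP for sharded training, SDPA for fused attention). As a minimal sketch of the SDPA call those projects rely on — the tensor shapes here are arbitrary and chosen only for illustration:

```python
import torch
import torch.nn.functional as F

# Arbitrary example shapes: (batch, heads, seq_len, head_dim)
q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# scaled_dot_product_attention dispatches to a fused (FlashAttention-style)
# kernel when the hardware and dtypes allow it, and otherwise falls back to
# a plain math implementation, so the same call works everywhere.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

print(out.shape)  # same shape as the query: (2, 8, 128, 64)
```

The output tensor always matches the query's shape; the causal mask is requested via `is_causal=True` rather than materializing an explicit mask tensor.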