NVIDIA-NeMo / Run
A tool to configure, launch and manage your machine learning experiments.
⭐ 205 · Updated this week
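For context, the sketch below shows roughly how such an experiment launcher is used: a Python function is wrapped as a configurable task and handed to an executor. It assumes the `nemo_run` package's `run.Partial`, `run.LocalExecutor`, and `run.run` entry points as described in its quickstart, and uses a hypothetical `train` function, so treat it as an illustrative sketch rather than the project's canonical example.

```python
# Minimal sketch of configuring and launching a task with NeMo-Run.
# Assumes `nemo_run` exposes run.Partial, run.LocalExecutor, and run.run
# (as in its quickstart); `train` is a hypothetical stand-in entrypoint.
import nemo_run as run


def train(epochs: int, lr: float) -> None:
    """Stand-in training function; replace with your real entrypoint."""
    print(f"training for {epochs} epochs at lr={lr}")


if __name__ == "__main__":
    # Wrap the function and its arguments as a configurable task,
    # then launch it on the local machine.
    task = run.Partial(train, epochs=2, lr=1e-3)
    run.run(task, executor=run.LocalExecutor())
```

The same task could be pointed at a different executor (for example, a cluster backend) without changing the training code, which is the core idea behind this kind of launcher.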
Alternatives and similar repositories for Run
Users interested in Run are comparing it to the libraries listed below:
- Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ⭐ 271 · Updated last week
- Load compute kernels from the Hub ⭐ 326 · Updated this week
- PyTorch Distributed native training library for LLMs/VLMs with out-of-the-box Hugging Face support ⭐ 167 · Updated this week
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ⭐ 446 · Updated last week
- Scalable and Performant Data Loading ⭐ 335 · Updated this week
- ⭐ 218 · Updated 9 months ago
- Google TPU optimizations for transformers models ⭐ 122 · Updated 9 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ⭐ 245 · Updated this week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ⭐ 77 · Updated 2 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024β346Updated 6 months ago
- This repository contains the experimental PyTorch native float8 training UXβ223Updated last year
- ArcticInference: vLLM plugin for high-throughput, low-latency inferenceβ299Updated this week
- Accelerating your LLM training to full speed! Made with β€οΈ by ServiceNow Researchβ259Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMsβ266Updated last year
- Where GPUs get cooked π©βπ³π₯β310Updated last month
- A family of compressed models obtained via pruning and knowledge distillationβ355Updated last week
- β225Updated 3 weeks ago
- π Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components.β216Updated last week
- Megatron's multi-modal data loaderβ266Updated this week
- A safetensors extension to efficiently store sparse quantized tensors on diskβ204Updated this week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ⭐ 201 · Updated this week
- A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ⭐ 317 · Updated last month
- Build compute kernels ⭐ 171 · Updated this week
- ⭐ 267 · Updated this week
- Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native. ⭐ 508 · Updated 6 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ⭐ 299 · Updated 2 weeks ago
- Efficient LLM Inference over Long Sequences ⭐ 390 · Updated 4 months ago
- PyTorch building blocks for the OLMo ecosystem ⭐ 317 · Updated this week
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ⭐ 202 · Updated last year
- PyTorch-native post-training at scale ⭐ 509 · Updated this week