NVIDIA-NeMo / Run
A tool to configure, launch and manage your machine learning experiments.
☆215 · Updated last week
Alternatives and similar repositories for Run
Users that are interested in Run are comparing it to the libraries listed below
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆279 · Updated 2 months ago
- PyTorch distributed-native training library for LLMs/VLMs with OOTB Hugging Face support ☆266 · Updated this week
- Scalable and Performant Data Loading ☆364 · Updated this week
- Load compute kernels from the Hub ☆389 · Updated this week
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆474 · Updated 3 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 · Updated 2 months ago
- Google TPU optimizations for transformers models ☆135 · Updated last week
- Efficient LLM Inference over Long Sequences ☆394 · Updated 7 months ago
- ☆219 · Updated last year
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆356 · Updated last week
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆219 · Updated last week
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆327 · Updated 4 months ago
- Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU) ☆205 · Updated this week
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆201 · Updated last year
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ☆384 · Updated this week
- Megatron's multi-modal data loader ☆315 · Updated last week
- A family of compressed models obtained via pruning and knowledge distillation ☆364 · Updated 2 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆272 · Updated last week
- Simple & Scalable Pretraining for Neural Architecture Research ☆307 · Updated last month
- Easy and Efficient Quantization for Transformers ☆204 · Updated last week
- ☆278 · Updated 2 weeks ago
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆79 · Updated last month
- Module, Model, and Tensor Serialization/Deserialization ☆286 · Updated 5 months ago
- LM engine is a library for pretraining/finetuning LLMs ☆113 · Updated this week
- A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM ☆220 · Updated this week
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆282 · Updated last week
- FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference. ☆334 · Updated 3 months ago
- This repository contains the experimental PyTorch-native float8 training UX ☆227 · Updated last year
- OpenAI-compatible API for the TensorRT-LLM Triton backend ☆220 · Updated last year
- ☆328 · Updated last week