NVIDIA-NeMo / Run
A tool to configure, launch and manage your machine learning experiments.
⭐216 · Updated this week
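NeMo-Run's core idea is to separate what you run (a configured task) from where you run it (an executor such as local, Slurm, or cloud). As a quick illustration, here is a minimal sketch assuming the `nemo_run` package's `Partial`, `LocalExecutor`, and `run` entry points as shown in its README; treat the exact names as assumptions and check the repository for the current API.

```python
# Minimal NeMo-Run sketch: configure a callable, then launch it.
# Assumes nemo_run exposes Partial, LocalExecutor, and run as in its
# README; verify against the repository before relying on this.
import nemo_run as run


def train(lr: float = 1e-3, steps: int = 100) -> None:
    """Stand-in for a real training entry point."""
    print(f"training for {steps} steps at lr={lr}")


if __name__ == "__main__":
    task = run.Partial(train, lr=3e-4, steps=10)  # configure the experiment
    run.run(task, executor=run.LocalExecutor())   # launch it locally
```

Swapping `LocalExecutor` for a cluster executor is, in principle, the only change needed to move the same experiment from a laptop to a cluster.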
Alternatives and similar repositories for Run
Users interested in Run are comparing it to the libraries listed below.
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ⭐279 · Updated 2 months ago
- PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support ⭐266 · Updated this week
- Load compute kernels from the Hub ⭐389 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ⭐267 · Updated 2 months ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ⭐474 · Updated 3 weeks ago
- Scalable and Performant Data Loading ⭐364 · Updated this week
- ⭐219 · Updated last year
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ⭐219 · Updated this week
- Google TPU optimizations for transformers models ⭐135 · Updated 2 weeks ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ⭐273 · Updated this week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ⭐205 · Updated this week
- Efficient LLM Inference over Long Sequences ⭐394 · Updated 7 months ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ⭐287 · Updated this week
- This repository contains the experimental PyTorch native float8 training UX ⭐227 · Updated last year
- 👷 Build compute kernels ⭐214 · Updated last week
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ⭐384 · Updated this week
- Where GPUs get cooked 👩‍🍳🔥 ⭐362 · Updated 2 weeks ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ⭐356 · Updated 2 weeks ago
- FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference. ⭐334 · Updated 3 months ago
- ⭐579 · Updated 4 months ago
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ⭐79 · Updated last month
- A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM ⭐220 · Updated this week
- TPU inference for vLLM, with unified JAX and PyTorch support. ⭐228 · Updated this week
- OpenAI-compatible API for the TensorRT-LLM Triton backend ⭐220 · Updated last year
- PyTorch-native post-training at scale ⭐605 · Updated last week
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ⭐327 · Updated 4 months ago
- Applied AI experiments and examples for PyTorch ⭐315 · Updated 5 months ago
- Module, Model, and Tensor Serialization/Deserialization ⭐286 · Updated 5 months ago
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ⭐404 · Updated last month
- Megatron's multi-modal data loader ⭐315 · Updated last week