NVIDIA / NeMo-Run
A tool to configure, launch and manage your machine learning experiments.
★122 · Updated this week
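As a quick orientation before the list of alternatives, here is a minimal sketch of what configuring and launching a task with NeMo-Run can look like. It follows the project's quickstart pattern (`run.Partial`, `run.LocalExecutor`, `run.run`); treat the exact signatures and argument names as assumptions to check against the installed version, not a definitive reference.

```python
# Minimal NeMo-Run sketch (assumed API, per the quickstart pattern):
# wrap a plain Python function as a configurable task and launch it locally.
import nemo_run as run

def train(epochs: int = 10, lr: float = 1e-3) -> None:
    # Stand-in for a real training loop.
    print(f"training for {epochs} epochs at lr={lr}")

if __name__ == "__main__":
    task = run.Partial(train, epochs=20)         # override defaults at configure time
    run.run(task, executor=run.LocalExecutor())  # swap the executor to launch elsewhere (e.g. Slurm)
```

The same task object can be handed to a different executor without touching the training code, which is the configure/launch/manage split the description above refers to.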
Alternatives and similar repositories for NeMo-Run:
Users interested in NeMo-Run are comparing it to the libraries listed below.
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ★225 · Updated this week
- Applied AI experiments and examples for PyTorch ★232 · Updated this week
- Megatron's multi-modal data loader ★167 · Updated this week
- PyTorch per step fault tolerance (actively under development) ★253 · Updated last week
- This repository contains the experimental PyTorch native float8 training UX ★221 · Updated 7 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ★260 · Updated 4 months ago (a minimal usage sketch follows this list)
- Scalable and Performant Data Loading ★222 · Updated this week
- LLM KV cache compression made easy ★412 · Updated last week
- Fast low-bit matmul kernels in Triton ★250 · Updated last week
- ★200 · Updated last month
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ★53 · Updated 3 weeks ago
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs ★231 · Updated last week
- ★100 · Updated 6 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ★188 · Updated this week
- ★176 · Updated 5 months ago
- Efficient LLM Inference over Long Sequences ★362 · Updated 2 weeks ago
- ★186 · Updated last week
- Google TPU optimizations for transformers models ★100 · Updated last month
- PyTorch RFCs (experimental) ★130 · Updated 6 months ago
- The Triton backend for the PyTorch TorchScript models. ★144 · Updated this week
- ring-attention experiments ★126 · Updated 4 months ago
- Code repo for the paper "SpinQuant: LLM quantization with learned rotations" ★218 · Updated 2 weeks ago
- Cataloging released Triton kernels. ★176 · Updated last month
- ★231 · Updated last week
- ★60 · Updated this week
- Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers ★206 · Updated 6 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ★195 · Updated 7 months ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ★173 · Updated this week
- OpenAI compatible API for TensorRT LLM triton backend ★198 · Updated 7 months ago
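For the vLLM entry above (the high-throughput inference and serving engine for LLMs), a minimal offline-inference sketch using its public `LLM`/`SamplingParams` API is shown below; the model id and sampling values are illustrative only, not recommendations.

```python
# Minimal offline-inference sketch with vLLM; the model id and sampling
# parameters here are placeholders chosen for illustration.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # any HF-compatible model id
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["NeMo-Run is a tool that"], params)
for out in outputs:
    print(out.outputs[0].text)
```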