NVIDIA / NeMo-Run
A tool to configure, launch and manage your machine learning experiments.
⭐146 · Updated this week
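To make the comparison concrete, here is a minimal sketch of the configure-and-launch workflow NeMo-Run's description refers to. It assumes the `nemo_run` package exposes `Partial`, `LocalExecutor`, and `run` as in its documented examples; the `train` task and its parameters are hypothetical placeholders.

```python
# Minimal sketch of NeMo-Run's configure/launch workflow.
# Assumption: `nemo_run` exposes Partial, LocalExecutor, and run()
# as in its documented examples; `train` is a hypothetical task.
import nemo_run as run


def train(lr: float = 1e-3, steps: int = 100) -> None:
    # Stand-in for a real training loop.
    print(f"training for {steps} steps at lr={lr}")


if __name__ == "__main__":
    configured = run.Partial(train, lr=3e-4, steps=10)  # configure the task
    run.run(configured, executor=run.LocalExecutor())   # launch it locally
```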
Alternatives and similar repositories for NeMo-Run
Users interested in NeMo-Run are comparing it to the libraries listed below.
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ⭐245 · Updated this week
- PyTorch per-step fault tolerance (actively under development) ⭐300 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs (usage sketch after this list) ⭐263 · Updated 7 months ago
- Scalable and Performant Data Loading ⭐258 · Updated this week
- Load compute kernels from the Hub ⭐119 · Updated last week
- Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU) ⭐186 · Updated this week
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ⭐195 · Updated this week
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the… ⭐159 · Updated this week
- Google TPU optimizations for transformers models ⭐109 · Updated 3 months ago
- Easy and Efficient Quantization for Transformers ⭐197 · Updated 3 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ⭐256 · Updated 10 months ago
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ⭐60 · Updated last month
- This repository contains the experimental PyTorch native float8 training UX ⭐224 · Updated 9 months ago
- Megatron's multi-modal data loader ⭐197 · Updated last week
- A family of compressed models obtained via pruning and knowledge distillation ⭐336 · Updated 6 months ago
- Applied AI experiments and examples for PyTorch ⭐265 · Updated 2 weeks ago
- ⭐209 · Updated 3 months ago
- Fast low-bit matmul kernels in Triton ⭐299 · Updated this week
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ⭐195 · Updated this week
- Efficient LLM Inference over Long Sequences ⭐373 · Updated 2 weeks ago
- ⭐255 · Updated last week
- The Triton backend for PyTorch TorchScript models. ⭐150 · Updated last week
- ⭐117 · Updated last year
- Inference server benchmarking tool ⭐59 · Updated 3 weeks ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ⭐109 · Updated last week
- Scalable toolkit for efficient model alignment ⭐794 · Updated 2 weeks ago
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ⭐325 · Updated this week
- Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips. ⭐229 · Updated this week
- ⭐190 · Updated last week
- ⭐138 · Updated 2 weeks ago
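As a usage sketch for the third entry above (the high-throughput LLM inference and serving engine, i.e. vLLM), the snippet below shows its offline generation API. It assumes the `vllm` package is installed; the model name and sampling settings are illustrative, not recommendations.

```python
# Illustrative offline inference with vLLM (third entry in the list above).
# Assumes `pip install vllm`; model and sampling values are examples only.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model, chosen for demonstration
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["What is FSDP?"], params)
for out in outputs:
    print(out.outputs[0].text)
```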