NVIDIA / NeMo-Run
A tool to configure, launch and manage your machine learning experiments.
⭐ 107 · Updated this week
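Before the comparison list, a minimal sketch of what "configure, launch and manage" looks like with NeMo-Run's Python API. It assumes the `nemo_run` package and its documented `run.Partial`, `run.LocalExecutor`, and `run.run` entrypoints; the `train` function and its arguments are illustrative placeholders, not part of NeMo-Run, and exact signatures may differ across versions.

```python
# Minimal sketch (assumed API): configure a task, then launch it with NeMo-Run.
import nemo_run as run


def train(model_name: str, lr: float) -> None:
    """Placeholder training task; NeMo-Run wraps ordinary Python callables."""
    print(f"training {model_name} with lr={lr}")


if __name__ == "__main__":
    # Configure: capture the callable and its arguments without executing it yet.
    task = run.Partial(train, model_name="demo-model", lr=1e-4)

    # Launch: run the configured task on a chosen executor (local here;
    # cluster executors such as Slurm are selected the same way).
    run.run(task, executor=run.LocalExecutor())
```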
Alternatives and similar repositories for NeMo-Run:
Users interested in NeMo-Run are comparing it to the libraries listed below.
- Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ⭐ 218 · Updated this week
- Google TPU optimizations for transformers models ⭐ 90 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ⭐ 257 · Updated 3 months ago
- Megatron's multi-modal data loader ⭐ 160 · Updated this week
- ⭐ 192 · Updated last week
- Easy and Efficient Quantization for Transformers ⭐ 192 · Updated last month
- This repository contains the experimental PyTorch native float8 training UX ⭐ 219 · Updated 5 months ago
- Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ⭐ 186 · Updated last week
- Applied AI experiments and examples for PyTorch ⭐ 216 · Updated last week
- Scalable and Performant Data Loading ⭐ 211 · Updated this week
- PyTorch per step fault tolerance (actively under development) ⭐ 226 · Updated this week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ⭐ 166 · Updated this week
- A project to improve skills of large language models ⭐ 239 · Updated this week
- Efficient LLM Inference over Long Sequences ⭐ 349 · Updated last month
- A toolkit for processing speech data and creating speech datasets ⭐ 104 · Updated this week
- Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers ⭐ 204 · Updated 5 months ago
- ⭐ 97 · Updated 5 months ago
- ⭐ 218 · Updated this week
- Manage scalable open LLM inference endpoints in Slurm clusters ⭐ 249 · Updated 6 months ago
- ⭐ 171 · Updated last week
- Fast low-bit matmul kernels in Triton ⭐ 199 · Updated last week
- A safetensors extension to efficiently store sparse quantized tensors on disk ⭐ 66 · Updated this week
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ⭐ 499 · Updated 3 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ⭐ 88 · Updated this week
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry ⭐ 40 · Updated last year
- ⭐ 85 · Updated 8 months ago
- Train, tune, and infer Bamba model ⭐ 80 · Updated 2 weeks ago
- ⭐ 58 · Updated 8 months ago
- ⭐ 154 · Updated last month
- The Triton backend for the PyTorch TorchScript models. ⭐ 141 · Updated last week