NVIDIA / NeMo-Run
A tool to configure, launch and manage your machine learning experiments.
★65 · Updated this week
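For context, a minimal sketch of the configure-and-launch workflow this tool provides: wrap a Python entry point as a configurable task and hand it to an executor. This is an illustrative example only; it assumes the `nemo_run` package exposes `Partial`, `LocalExecutor`, and `run` as shown in the project README, and the `train` function is a hypothetical stand-in for a real training script.

```python
# Illustrative sketch only: assumes `pip install nemo_run` and that the
# Partial / LocalExecutor / run helpers behave as described in the README.
import nemo_run as run


def train(steps: int = 10, lr: float = 1e-3) -> None:
    """Hypothetical entry point; a real run would build a model and data here."""
    for step in range(steps):
        print(f"step={step} lr={lr}")


if __name__ == "__main__":
    # Capture the function and its arguments as a configurable task.
    task = run.Partial(train, steps=100, lr=3e-4)

    # Launch locally; other executors (e.g. Slurm) can be swapped in to
    # target a cluster without changing the task definition.
    run.run(task, executor=run.LocalExecutor())
```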
Related projects
Alternatives and complementary repositories for NeMo-Run
- Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ★193 · Updated this week
- Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ★165 · Updated this week
- Torch Distributed Experimental ★116 · Updated 3 months ago
- Megatron's multi-modal data loader ★136 · Updated this week
- Some common Hugging Face transformers in maximal update parametrization (µP) ★76 · Updated 2 years ago
- Implementation of a Transformer, but completely in Triton ★248 · Updated 2 years ago
- NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the… ★38 · Updated last month
- This repository contains the experimental PyTorch native float8 training UX ★211 · Updated 3 months ago
- Various transformers for FSDP research ★33 · Updated 2 years ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ★146 · Updated 2 weeks ago
- Applied AI experiments and examples for PyTorch ★166 · Updated 3 weeks ago
- A pipeline to improve skills of large language models ★191 · Updated this week
- Google TPU optimizations for transformers models ★75 · Updated this week
- Scalable and Performant Data Loading ★66 · Updated this week
- Transform datasets at scale. Optimize datasets for fast AI model training. ★368 · Updated this week
- Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers ★195 · Updated 3 months ago
- Triton-based implementation of Sparse Mixture of Experts. ★185 · Updated last month
- LLM KV cache compression made easy ★64 · Updated last week
- Ring-attention experiments ★97 · Updated last month
- Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers ★58 · Updated 4 months ago