NVIDIA-NeMo / NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
☆16,686 Updated this week
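For orientation, here is a minimal sketch of what using NeMo looks like for speech recognition. It assumes a recent NeMo release installed via `pip install "nemo_toolkit[asr]"`; the checkpoint name and audio file below are illustrative, so check the NeMo documentation for currently available pretrained models:

```python
# Minimal NeMo ASR sketch. The model name "stt_en_conformer_ctc_small" and
# the file "sample.wav" are illustrative assumptions, not guaranteed to match
# the latest release; consult the NeMo docs for current checkpoint names.
import nemo.collections.asr as nemo_asr

# Download and load a pretrained speech-to-text model.
asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="stt_en_conformer_ctc_small"
)

# Transcribe a local 16 kHz mono WAV file.
transcripts = asr_model.transcribe(["sample.wav"])
print(transcripts[0])
```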
Alternatives and similar repositories for NeMo
Users interested in NeMo are comparing it to the libraries listed below
- Fast and memory-efficient exact attention ☆22,113 Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ☆7,939 Updated 2 weeks ago
- TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat… ☆12,811 Updated this week
- Transformer-related optimization, including BERT, GPT ☆6,392 Updated last year
- SGLang is a high-performance serving framework for large language models and multimodal models. ☆23,091 Updated this week
- Development repository for the Triton language and compiler ☆18,319 Updated this week
- Ongoing research training transformer models at scale ☆15,100 Updated this week
- The Triton Inference Server provides an optimized cloud and edge inferencing solution. ☆10,298 Updated last week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆9,486 Updated this week
- Large Language Model Text Generation Inference ☆10,749 Updated 3 weeks ago
- Tensor library for machine learning ☆13,907 Updated last week
- End-to-End Speech Processing Toolkit ☆9,711 Updated last week
- Fast inference engine for Transformer models ☆4,274 Updated last week
- Tools for merging pretrained large language models. ☆6,761 Updated last week
- 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization… ☆3,277 Updated 3 weeks ago
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,830 Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆69,622 Updated this week
- PyTorch-native post-training library ☆5,660 Updated this week
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production ☆10,445 Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆41,509 Updated this week
- Repo for external large-scale work ☆6,543 Updated last year
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆22,002 Updated 2 weeks ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆32,125 Updated 4 months ago
- ☆2,948 Updated 3 weeks ago
- PyTorch extensions for high-performance and large-scale training. ☆3,397 Updated 9 months ago
- High-speed Large Language Model Serving for Local Deployment ☆8,635 Updated 2 weeks ago
- Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python. ☆6,180 Updated 5 months ago
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆20,587 Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. ☆10,326 Updated this week
- Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, o… ☆9,418 Updated this week