XuehaiPan / nvitop
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
☆6,380 · Updated this week
Alternatives and similar repositories for nvitop
Users who are interested in nvitop are comparing it to the repositories listed below.
- GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm ☆9,853 · Updated last month
- Fast and memory-efficient exact attention ☆20,904 · Updated last week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i…) ☆9,348 · Updated last week
- Multi-GPU CUDA stress test ☆2,025 · Updated last month
- 📊 A simple command-line utility for querying and monitoring GPU status ☆4,308 · Updated 8 months ago
- Accessible large language models via k-bit quantization for PyTorch ☆7,801 · Updated last week
- Simple, safe way to store and distribute tensors ☆3,547 · Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction ☆10,157 · Updated last week
- Ongoing research training transformer models at scale ☆14,493 · Updated this week
- Transformer-related optimization, including BERT, GPT ☆6,365 · Updated last year
- PyTorch extensions for high performance and large scale training ☆3,388 · Updated 7 months ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective ☆40,961 · Updated this week
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs ☆7,377 · Updated this week
- View model summaries in PyTorch! ☆2,883 · Updated last week
- Flexible and powerful tensor operations for readable and reliable code (for PyTorch, JAX, TF and others) ☆9,312 · Updated 2 weeks ago
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ☆3,386 · Updated 4 months ago
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H… ☆2,993 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning ☆20,268 · Updated this week
- Running large language models on a single GPU for throughput-oriented scenarios ☆9,380 · Updated last year
- Example models using DeepSpeed ☆6,747 · Updated last month
- Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine ☆4,237 · Updated last month
- SGLang is a fast serving framework for large language models and vision language models ☆21,166 · Updated this week
- The Fast Cross-Platform Package Manager ☆7,819 · Updated this week
- AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N…) ☆4,696 · Updated last month
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond ☆24,159 · Updated last year
- High-speed Large Language Model Serving for Local Deployment ☆8,434 · Updated 4 months ago
- PyTorch native post-training library ☆5,615 · Updated this week
- ☆4,110 · Updated last year
- A conda-forge distribution ☆8,982 · Updated last week
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ☆14,203 · Updated last month