quic / cloud-ai-sdk
The Qualcomm Cloud AI SDK (Platform and Apps) enables high-performance deep learning inference on Qualcomm Cloud AI platforms, delivering high throughput and low latency across Computer Vision, Object Detection, Natural Language Processing, and Generative AI models.
☆61 · Updated 2 weeks ago
Alternatives and similar repositories for cloud-ai-sdk
Users who are interested in cloud-ai-sdk are comparing it to the libraries listed below:
- This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor… ☆68 · Updated this week
- Tutorials for running models on First-gen Gaudi and Gaudi2 for Training and Inference. The source files for the tutorials on https://dev… ☆61 · Updated this week
- Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inferen… ☆62 · Updated 3 weeks ago
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆183 · Updated last month
- CUDA Matrix Multiplication Optimization ☆189 · Updated 10 months ago
- Slides, notes, and materials for the workshop ☆326 · Updated last year
- Collection of kernels written in Triton language ☆125 · Updated 2 months ago
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆202 · Updated last month
- Model compression for ONNX ☆96 · Updated 6 months ago
- Fast low-bit matmul kernels in Triton ☆311 · Updated this week
- The Triton backend for the ONNX Runtime. ☆148 · Updated 3 weeks ago
- nvidia-modelopt is a unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculat… ☆956 · Updated 2 weeks ago
- ☆215 · Updated this week
- Notes on quantization in neural networks ☆83 · Updated last year
- ☆169 · Updated 5 months ago
- NVIDIA tools guide ☆133 · Updated 5 months ago
- The Triton backend for the PyTorch TorchScript models. ☆150 · Updated 3 weeks ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆186 · Updated this week
- ☆157 · Updated last year
- Applied AI experiments and examples for PyTorch ☆271 · Updated last week
- ☆74 · Updated this week
- Experimental projects related to TensorRT ☆105 · Updated last week
- Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Serv… ☆478 · Updated this week
- cudnn_frontend provides a c++ wrapper for the cudnn backend API and samples on how to use it ☆572 · Updated last week
- AI Edge Quantizer: flexible post training quantization for LiteRT models. ☆41 · Updated last week
- Cataloging released Triton kernels. ☆229 · Updated 4 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆198 · Updated this week
- ☆140 · Updated 3 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆185 · Updated this week
- The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's python API. ☆134 · Updated this week