quic / cloud-ai-sdk
The Qualcomm Cloud AI SDK (Platform and Apps) enables high-performance deep learning inference on Qualcomm Cloud AI platforms, delivering high throughput and low latency across Computer Vision, Object Detection, Natural Language Processing, and Generative AI models.
☆61Updated last month
Alternatives and similar repositories for cloud-ai-sdk
Users interested in cloud-ai-sdk are comparing it to the libraries listed below.
- This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor…☆71Updated this week
- Tutorials for running models on First-gen Gaudi and Gaudi2 for Training and Inference. The source files for the tutorials on https://dev…☆61Updated last week
- AI Edge Quantizer: flexible post training quantization for LiteRT models.☆49Updated this week
- OpenVINO backend for Triton.☆32Updated last week
- A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. …☆1,006Updated last week
- CUDA Matrix Multiplication Optimization☆196Updated 11 months ago
- SandLogic Lexicons☆19Updated 8 months ago
- Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inferen…☆64Updated 2 weeks ago
- ☆85Updated this week
- The Triton backend for the ONNX Runtime.☆153Updated last week
- Intel® End-to-End AI Optimization Kit☆32Updated 11 months ago
- Experimental projects related to TensorRT☆105Updated last week
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")☆339Updated this week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)☆188Updated this week
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python.☆360Updated this week
- PyTorch emulation library for Microscaling (MX)-compatible data formats☆251Updated last week
- Model compression for ONNX☆96Updated 7 months ago
- Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Serv…☆479Updated 2 weeks ago
- ☆159Updated last year
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs.☆205Updated 2 months ago
- This repository contains tutorials and examples for Triton Inference Server☆724Updated 2 weeks ago
- ☆149Updated 2 years ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆43Updated 3 months ago
- The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's python API.☆135Updated 3 weeks ago
- NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.☆200Updated last year
- oneCCL Bindings for Pytorch*☆97Updated 2 months ago
- The Triton backend for TensorRT.☆77Updated last week
- Visualize ONNX models with model-explorer☆36Updated last month
- Integrating SSE with NVIDIA Triton Inference Server using a Python backend and Zephyr model. There is very little documentation on how to use …☆10Updated last year
- Convert tflite to JSON and make it editable in the IDE. It also converts the edited JSON back to tflite binary.☆27Updated 2 years ago