quic / cloud-ai-sdk
Qualcomm Cloud AI SDK (Platform and Apps) enables high-performance deep learning inference on Qualcomm Cloud AI platforms, delivering high throughput and low latency across Computer Vision, Object Detection, Natural Language Processing, and Generative AI models.
☆71 Updated 2 months ago
Alternatives and similar repositories for cloud-ai-sdk
Users who are interested in cloud-ai-sdk are comparing it to the libraries listed below.
- This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor… ☆85 Updated last week
- Run Generative AI models with simple C++/Python API and using OpenVINO Runtime ☆428 Updated last week
- Qualcomm® AI Hub Models is our collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) an… ☆915 Updated 2 weeks ago
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆532 Updated last week
- This repository contains tutorials and examples for Triton Inference Server ☆819 Updated this week
- AI Edge Quantizer: flexible post training quantization for LiteRT models. ☆96 Updated last week
- Notes on quantization in neural networks ☆117 Updated 2 years ago
- Slides, notes, and materials for the workshop ☆339 Updated last year
- Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. Th… ☆432 Updated this week
- Tutorials for running models on First-gen Gaudi and Gaudi2 for Training and Inference. The source files for the tutorials on https://dev… ☆64 Updated 4 months ago
- Support PyTorch model conversion with LiteRT. ☆935 Updated this week
- A curated list of OpenVINO based AI projects ☆181 Updated 7 months ago
- ☆134 Updated last week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆205 Updated last week
- ☆181 Updated 2 weeks ago
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆420 Updated last week
- Some CUDA example code with READMEs. ☆179 Updated 2 months ago
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆327 Updated 4 months ago
- A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresse… ☆1,964 Updated this week
- Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inferen… ☆73 Updated last week
- ☆328 Updated this week
- The Triton backend for the ONNX Runtime. ☆173 Updated this week
- ☆137 Updated last week
- 🎯 An accuracy-first, highly efficient quantization toolkit for LLMs, designed to minimize quality degradation across Weight-Only Quantiza… ☆845 Updated this week
- Visualize ONNX models with model-explorer ☆67 Updated last month
- QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX ☆175 Updated last week
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆306 Updated last year
- Model compression for ONNX ☆98 Updated last year
- Pre-built components and code samples to help you build and deploy production-grade AI applications with the OpenVINO™ Toolkit from Intel ☆202 Updated this week
- This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai ☆113 Updated this week