onnx / turnkeyml
The no-code AI toolchain
Related projects:
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python (a minimal sketch follows after this list).
- AMD-related optimizations for transformer models
- Notes and artifacts from the ONNX Steering Committee
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) devices. Note…
- Common utilities for ONNX converters
- AMD's graph optimization engine.
- Repository of model demos using TT-Buda
- Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU)
- Development repository for the Triton language and compiler
- Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference
- Home for the OctoML PyTorch Profiler
- Inference of Vision Transformer (ViT) in plain C/C++ with ggml
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona…
- Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for t…
- An innovative library for efficient LLM inference via low-bit quantization
- NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.
- OpenAI Triton backend for Intel® GPUs
- A high-throughput and memory-efficient inference and serving engine for LLMs
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
- QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools
- onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime
- Supporting PyTorch models with the Google AI Edge TFLite runtime.
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
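
As a brief illustration of the ONNX Script entry above, here is a minimal sketch of authoring an ONNX function directly in Python. It assumes the onnxscript package (with its opset18 module) is installed; the function name and tensor shapes are illustrative only, not taken from any of the repositories listed.

```python
# Minimal sketch, assuming the onnxscript package is installed.
from onnxscript import script, FLOAT
from onnxscript import opset18 as op


@script()
def matmul_add(X: FLOAT[4, 8], W: FLOAT[8, 16], B: FLOAT[16]) -> FLOAT[4, 16]:
    # Each operator call below is translated into an ONNX graph node.
    return op.Add(op.MatMul(X, W), B)


# The decorated function can be exported as a standalone ONNX model.
model_proto = matmul_add.to_model_proto()
```

The resulting ModelProto can then be saved with onnx.save or executed with ONNX Runtime like any other ONNX model.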