onnx / turnkeyml
The no-code AI toolchain
☆75 · Updated this week
Related projects
Alternatives and complementary repositories for turnkeyml
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python (a brief authoring sketch follows this list). ☆288 · Updated this week
- Common utilities for ONNX converters ☆252 · Updated 5 months ago
- Notes and artifacts from the ONNX steering committee ☆25 · Updated last week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆153 · Updated this week
- AMD's graph optimization engine. ☆187 · Updated this week
- AMD related optimizations for transformer models ☆57 · Updated 2 weeks ago
- Model compression for ONNX ☆75 · Updated this week
- Repository of model demos using TT-Buda ☆55 · Updated 3 weeks ago
- OpenAI Triton backend for Intel® GPUs ☆143 · Updated this week
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆89 · Updated this week
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona… ☆63 · Updated this week
- QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX ☆127 · Updated 3 weeks ago
- PyTorch emulation library for Microscaling (MX)-compatible data formats ☆164 · Updated 2 months ago
- SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi ☆37 · Updated last year
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆38 · Updated 3 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆253 · Updated last month
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆35 · Updated 6 months ago
- Home for OctoML PyTorch Profiler ☆107 · Updated last year
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) device. Note… ☆57 · Updated 2 months ago
- Run Generative AI models with simple C++/Python API and using OpenVINO Runtime ☆155 · Updated this week
- Advanced Quantization Algorithm for LLMs. This is official implementation of "Optimize Weight Rounding via Signed Gradient Descent for t…" ☆248 · Updated this week
- Repository for the QUIK project, enabling the use of 4bit kernels for generative inference - EMNLP 2024 ☆173 · Updated 7 months ago
- ONNX Adapter for model-explorer ☆25 · Updated 2 months ago
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆410 · Updated this week
- Standalone Flash Attention v2 kernel without libtorch dependency ☆98 · Updated 2 months ago
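
For the ONNX Script entry above, the following is a minimal sketch of what authoring an ONNX function in plain Python looks like, assuming the `onnxscript` and `onnx` packages are installed; the function name and the scaled-tanh math are illustrative choices, not code from any repository in this list.

```python
# Minimal ONNX Script sketch (assumes `onnxscript` is installed).
# `scaled_tanh` is a hypothetical example function, used only for illustration.
from onnxscript import FLOAT, script
from onnxscript import opset18 as op


@script()
def scaled_tanh(X: FLOAT[None]) -> FLOAT[None]:
    # Ordinary Python expressions are translated into ONNX operators.
    two = op.CastLike(2.0, X)
    return two * op.Tanh(X)


# The decorated function can be converted into a standard ONNX ModelProto.
model = scaled_tanh.to_model_proto()
```

The resulting `ModelProto` can then be saved with `onnx.save` and run or benchmarked with any ONNX-compatible runtime or toolchain.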