onnx / turnkeyml
The no-code AI toolchain
☆89 Updated last week
Alternatives and similar repositories for turnkeyml:
Users who are interested in turnkeyml are comparing it to the libraries listed below.
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python (see the authoring sketch after this list). ☆317 Updated this week
- AMD's graph optimization engine. ☆208 Updated this week
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditional BLAS library. ☆79 Updated this week
- Common utilities for ONNX converters ☆259 Updated 2 months ago
- Fast low-bit matmul kernels in Triton ☆236 Updated this week
- High-Performance SGEMM on CUDA devices ☆74 Updated 3 weeks ago
- ☆58 Updated last year
- ☆105 Updated 3 months ago
- Unified compiler/runtime for interfacing with PyTorch Dynamo. ☆100 Updated this week
- AMD-related optimizations for transformer models ☆67 Updated 3 months ago
- Model compression for ONNX ☆86 Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆259 Updated 4 months ago
- OpenAI Triton backend for Intel® GPUs ☆165 Updated this week
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆38 Updated 9 months ago
- BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment. ☆523 Updated last week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) device. Note… ☆60 Updated 2 months ago
- Advanced Quantization Algorithm for LLMs/VLMs. ☆372 Updated this week
- An innovative library for efficient LLM inference via low-bit quantization ☆351 Updated 5 months ago
- Notes and artifacts from the ONNX steering committee ☆25 Updated this week
- LLM training in simple, raw C/CUDA ☆91 Updated 9 months ago
- ☆34 Updated this week
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆88 Updated this week
- ☆157 Updated last week
- onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime (see the registration sketch after this list) ☆360 Updated this week
- Development repository for the Triton language and compiler ☆107 Updated this week
- Efficient GPU support for LLM inference with x-bit quantization (e.g. FP6, FP5). ☆234 Updated 3 months ago
- OpenVINO Tokenizers extension ☆29 Updated this week
- A GPU-driven system framework for scalable AI applications ☆112 Updated 2 weeks ago
- TORCH_LOGS parser for PT2 ☆32 Updated this week
- ☆197 Updated 3 weeks ago
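For readers comparing these tools, two hedged sketches follow. First, the kind of authoring workflow the ONNX Script entry refers to: an ONNX function written as ordinary Python and compiled into a graph. This is a minimal illustration against the public onnxscript API; the function name and the computation itself are invented for the example, not taken from any of the listed repositories.

```python
from onnxscript import FLOAT, script
from onnxscript import opset18 as op


@script()
def standardize(X: FLOAT["N"]) -> FLOAT["N"]:
    # Illustrative function: Python operators and opset-18 calls mix freely;
    # onnxscript compiles this body into an ONNX graph rather than running it eagerly.
    mean = op.ReduceMean(X)
    centered = X - mean
    return centered / op.Sqrt(op.ReduceMean(centered * centered))


# Export as a regular onnx.ModelProto that any ONNX backend can load.
model = standardize.to_model_proto()
```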
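Second, a sketch of how the onnxruntime-extensions entry is typically wired into ONNX Runtime: its custom pre/post-processing operators are registered on a SessionOptions object before the session is created. The model path below is a placeholder, not a file from any of these repositories.

```python
import onnxruntime as ort
from onnxruntime_extensions import get_library_path

sess_options = ort.SessionOptions()
# Make the extensions' custom operators (tokenizers, string ops, etc.)
# visible to this ONNX Runtime session.
sess_options.register_custom_ops_library(get_library_path())

# Placeholder path: any model whose graph uses onnxruntime-extensions custom ops.
session = ort.InferenceSession(
    "model_with_custom_ops.onnx",
    sess_options,
    providers=["CPUExecutionProvider"],
)
```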