justinchuby / onnx-safetensors
Use safetensors with ONNX 🤗
☆73 · Updated 3 weeks ago
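In practice, the tagline means reading tensor weights from a .safetensors file and attaching them to an ONNX graph (and writing them back out). Below is a minimal illustrative sketch of that idea using only the plain `safetensors` and `onnx` Python packages rather than onnx-safetensors' own helpers; the file names and the initializer-patching loop are assumptions for illustration, not the library's API.

```python
# Illustrative sketch only (not the onnx-safetensors API): patch weights from a
# .safetensors file into an ONNX model's initializers using the plain
# `safetensors` and `onnx` packages. File names are hypothetical.
import onnx
from onnx import numpy_helper
from safetensors.numpy import load_file

model = onnx.load("model.onnx")            # ONNX graph whose initializers we want to replace
weights = load_file("model.safetensors")   # dict: tensor name -> numpy array

for initializer in model.graph.initializer:
    if initializer.name in weights:
        # Rebuild the initializer from the safetensors copy, keeping its name
        # so references inside the graph stay valid.
        initializer.CopyFrom(numpy_helper.from_array(weights[initializer.name], initializer.name))

onnx.save(model, "model_with_safetensors_weights.onnx")
```

The repository presumably wraps this kind of round trip behind a friendlier API; the sketch is only meant to show what "use safetensors with ONNX" looks like in practice.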
Alternatives and similar repositories for onnx-safetensors
Users interested in onnx-safetensors are comparing it to the libraries listed below:
- Model compression for ONNX · ☆97 · Updated 11 months ago
- Thin wrapper around GGML to make life easier · ☆40 · Updated 4 months ago
- No-code CLI designed for accelerating ONNX workflows · ☆215 · Updated 4 months ago
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python · ☆404 · Updated last week
- Python bindings for ggml · ☆146 · Updated last year
- Inference Vision Transformer (ViT) in plain C/C++ with ggml · ☆295 · Updated last year
- onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime · ☆418 · Updated last week
- A toolkit to help optimize ONNX models · ☆228 · Updated this week
- A safetensors extension to efficiently store sparse quantized tensors on disk · ☆180 · Updated last week
- 🤗 Optimum ONNX: Export your model to ONNX and run inference with ONNX Runtime · ☆70 · Updated last week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools · ☆502 · Updated this week
- Visualize ONNX models with model-explorer · ☆62 · Updated 2 weeks ago
- 🤗 Optimum ExecuTorch · ☆74 · Updated this week
- AMD-related optimizations for transformer models · ☆93 · Updated 2 weeks ago
- Common utilities for ONNX converters · ☆283 · Updated last month
- AI Edge Quantizer: flexible post-training quantization for LiteRT models · ☆72 · Updated last week
- TTS support with GGML · ☆184 · Updated 3 weeks ago
- GGUF parser in Python · ☆28 · Updated last year
- Module, Model, and Tensor Serialization/Deserialization · ☆270 · Updated 2 months ago
- An innovative library for efficient LLM inference via low-bit quantization · ☆349 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs · ☆266 · Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates · ☆193 · Updated last month
- Generative AI extensions for onnxruntime · ☆861 · Updated last week
- 👷 Build compute kernels · ☆163 · Updated this week
- PyTorch half-precision GEMM library with fused optional bias and optional ReLU/GELU · ☆75 · Updated 10 months ago
- OpenVINO Tokenizers extension · ☆42 · Updated this week
- The Triton backend for the ONNX Runtime · ☆162 · Updated 2 weeks ago
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs · ☆110 · Updated last year