justinchuby / onnx-safetensors
Use safetensors with ONNX 🤗
☆78 · Updated 2 months ago
Alternatives and similar repositories for onnx-safetensors
Users interested in onnx-safetensors are comparing it to the libraries listed below.
- Model compression for ONNX ☆99 · Updated last year
- Visualize ONNX models with model-explorer ☆66 · Updated 2 weeks ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆304 · Updated last year
- onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime ☆431 · Updated last week
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python ☆414 · Updated last week
- Python bindings for ggml ☆146 · Updated last year
- No-code CLI designed for accelerating ONNX workflows ☆222 · Updated 6 months ago
- 🤗 Optimum ONNX: Export your model to ONNX and run inference with ONNX Runtime ☆105 · Updated last week
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆225 · Updated last week
- A toolkit to help optimize ONNX models ☆288 · Updated this week
- AI Edge Quantizer: flexible post-training quantization for LiteRT models ☆84 · Updated last week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆522 · Updated this week
- Thin wrapper around GGML to make life easier ☆40 · Updated last month
- 🤗 Optimum ExecuTorch ☆93 · Updated last week
- Common utilities for ONNX converters ☆289 · Updated 2 weeks ago
- Generative AI extensions for onnxruntime ☆911 · Updated this week
- An innovative library for efficient LLM inference via low-bit quantization ☆351 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 · Updated 3 weeks ago
- OpenVINO Tokenizers extension ☆44 · Updated last week
- 👷 Build compute kernels ☆195 · Updated last week
- Shrinks ONNX files by quantizing large float constants into eight-bit equivalents ☆27 · Updated 3 weeks ago
- TTS support with GGML ☆204 · Updated 2 months ago
- AMD-related optimizations for transformer models ☆96 · Updated 2 months ago
- Notes and artifacts from the ONNX steering committee ☆27 · Updated last week
- Python package of rocm-smi-lib ☆24 · Updated 2 weeks ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆155 · Updated last year
- PyTorch half-precision GEMM library with fused optional bias and optional ReLU/GELU ☆75 · Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆202 · Updated 3 months ago
- A general 2-8 bit quantization toolbox with GPTQ/AWQ/HQQ/VPTQ, and easy export to onnx/onnx-runtime ☆184 · Updated 8 months ago
- Efficient in-memory representation for ONNX, in Python ☆37 · Updated last week