justinchuby / onnx-safetensors
Use safetensors with ONNX 🤗
☆25 Updated 2 months ago
Related projects
Alternatives and complementary repositories for onnx-safetensors
- ONNX Adapter for model-explorer ☆25 Updated last month
- The no-code AI toolchain ☆74 Updated 3 weeks ago
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆282 Updated this week
- onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime ☆334 Updated this week
- Module, Model, and Tensor Serialization/Deserialization ☆187 Updated 3 weeks ago
- OpenVINO Tokenizers extension ☆24 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆250 Updated last month
- Common utilities for ONNX converters ☆251 Updated 4 months ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆229 Updated 7 months ago
- AMD related optimizations for transformer models ☆57 Updated last week
- Model compression for ONNX ☆74 Updated last month
- A general 2-8 bits quantization toolbox with GPTQ/AWQ/HQQ, and export to onnx/onnx-runtime easily. ☆148 Updated last month
- Python package of rocm-smi-lib ☆18 Updated last month
- Python bindings for ggml ☆132 Updated 2 months ago
- Development repository for the Triton language and compiler ☆93 Updated this week
- Repository of model demos using TT-Buda ☆55 Updated 2 weeks ago
- Generative AI extensions for onnxruntime ☆504 Updated this week
- Google TPU optimizations for transformers models ☆74 Updated last week
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆89 Updated this week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆153 Updated this week
- QuIP quantization ☆46 Updated 7 months ago
- asynchronous/distributed speculative evaluation for llama3 ☆37 Updated 3 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆154 Updated 3 weeks ago
- ☆116 Updated last week
- Universal cross-platform tokenizers binding to HF and sentencepiece ☆273 Updated 3 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆46 Updated this week
- ☆156 Updated last month
- AMD's graph optimization engine. ☆185 Updated this week
- ☆152 Updated this week