huggingface / optimum-onnx
🤗 Optimum ONNX: Export your model to ONNX and run inference with ONNX Runtime
★33 · Updated this week
Alternatives and similar repositories for optimum-onnx
Users that are interested in optimum-onnx are comparing it to the libraries listed below
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ★83 · Updated 2 weeks ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ★33 · Updated 3 months ago
- ★64 · Updated last month
- ★41 · Updated 3 months ago
- ★80 · Updated 2 months ago
- python bindings for symphonia/opus - read various audio formats from python and write opus files ★65 · Updated last month
- 🤗 Trade any tensors over the network ★30 · Updated last year
- ★49 · Updated 6 months ago
- ★15 · Updated last year
- vLLM adapter for a TGIS-compatible gRPC server. ★35 · Updated this week
- A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers. ★43 · Updated last month
- Small python package to measure OCR quality and other related metrics. ★25 · Updated last year
- ML/DL Math and Method notes ★63 · Updated last year
- Truly flash implementation of DeBERTa disentangled attention mechanism. ★63 · Updated last week
- Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including… ★66 · Updated last month
- PyLate efficient inference engine ★62 · Updated last month
- 🤗 A collection of templates for Hugging Face Spaces ★35 · Updated last year
- Google TPU optimizations for transformers models ★118 · Updated 7 months ago
- 👷 Build compute kernels ★106 · Updated last week
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ★24 · Updated last year
- Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible… ★76 · Updated last month
- **ARCHIVED** Filesystem interface to 🤗 Hub ★58 · Updated 2 years ago
- Open TTS models, built for streaming on the edge ★43 · Updated 5 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ★34 · Updated 2 years ago
- A fast RWKV Tokenizer written in Rust ★49 · Updated last week
- Pre-train Static Word Embeddings ★85 · Updated 2 months ago
- ★33 · Updated last month
- ★51 · Updated 6 months ago
- Rust crate for some audio utilities ★26 · Updated 5 months ago
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ★69 · Updated last week