microsoft / onnxruntime-web-benchmark
ONNX Runtime Web benchmark tool
☆8 Updated last year
Alternatives and similar repositories for onnxruntime-web-benchmark:
Users who are interested in onnxruntime-web-benchmark are comparing it to the libraries listed below.
- GPU-accelerated client-side embeddings for vector search, RAG, etc. ☆66 Updated last year
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ☆37 Updated last year
- SGLang is a fast serving framework for large language models and vision language models. ☆19 Updated last month
- ☆37 Updated 2 years ago
- ☆52 Updated last week
- ☆25 Updated 3 months ago
- Implementation of https://arxiv.org/pdf/2312.09299 ☆20 Updated 8 months ago
- ☆79 Updated this week
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization an ONNX runtime port… ☆16 Updated 6 months ago
- ANE accelerated embedding models! ☆17 Updated 3 months ago
- Rust bindings for CTranslate2 ☆14 Updated last year
- Simple video upscaler building on the tile upscaler from https://huggingface.co/spaces/gokaygokay/Tile-Upscaler ☆14 Updated 8 months ago
- RWKV-7: Surpassing GPT ☆82 Updated 4 months ago
- Python bindings for symphonia/opus - read various audio formats from Python and write opus files ☆33 Updated 2 weeks ago
- Showcase how mxbai-embed-large-v1 can be used to produce binary embeddings. Binary embeddings enabled 32x storage savings and 40x faster r… ☆15 Updated last year
- ☆52 Updated 11 months ago
- Sentence Embedding as a Service ☆15 Updated last year
- Gradio UI for a Cog API ☆66 Updated 11 months ago
- Experiments with BitNet inference on CPU ☆53 Updated 11 months ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, with PyTorch/CUDA ☆36 Updated last year
- Training hybrid models for dummies. ☆20 Updated 2 months ago
- ☆12 Updated last year
- Llama implementations benchmarking framework ☆12 Updated last year
- A converter and basic tester for RWKV ONNX ☆42 Updated last year