microsoft / onnxruntime-web-benchmark
ONNX Runtime Web benchmark tool
☆8 · Updated last year
Alternatives and similar repositories for onnxruntime-web-benchmark:
Users who are interested in onnxruntime-web-benchmark are comparing it to the libraries listed below.
- GGML implementation of BERT model with Python bindings and quantization. ☆27 · Updated 11 months ago
- ☆49 · Updated last month
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ☆37 · Updated last year
- Run LLMs on Replicate with vLLM ☆15 · Updated 3 months ago
- GPU accelerated client-side embeddings for vector search, RAG etc. ☆66 · Updated last year
- ☆36 · Updated 2 years ago
- Run ONNX RWKV-v4 models with GPU acceleration using DirectML [Windows], or just on CPU [Windows AND Linux]; Limited to 430M model at this… ☆20 · Updated last year
- ONNX implementation of Whisper. PyTorch free. ☆88 · Updated last month
- wasm bindings for huggingface tokenizers library ☆35 · Updated 2 years ago
- ☆52 · Updated 8 months ago
- ☆25 · Updated last month
- implementation of https://arxiv.org/pdf/2312.09299 ☆20 · Updated 6 months ago
- A converter and basic tester for rwkv onnx ☆42 · Updated 11 months ago
- Neural Network Execution Service ☆11 · Updated last year
- Accelerated inference of 🤗 models using FuriosaAI NPU chips. ☆26 · Updated 7 months ago
- Experiments with BitNet inference on CPU ☆52 · Updated 9 months ago
- Inference Llama 2 in one file of pure JavaScript (HTML) ☆30 · Updated 6 months ago
- LLama implementations benchmarking framework ☆12 · Updated last year
- ☆19 · Updated 5 months ago
- ☆12 · Updated last year
- ⚡Delightful WebNN resources, curated list of awesome things around WebNN ecosystem.😎 ☆51 · Updated 2 months ago
- Web browser version of StarCoder.cpp ☆43 · Updated last year
- Rust bindings for CTranslate2 ☆14 · Updated last year
- An open-source replication and extension of Meta AI's LLAMA dataset ☆24 · Updated last year
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization an ONNX runtime port… ☆15 · Updated 4 months ago
- Tutorial on how to convert machine learned models into ONNX ☆16 · Updated last year
- Course Project for COMP4471 on RWKV ☆16 · Updated 11 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆52 · Updated 10 months ago
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆41 · Updated last year
- Open deep learning compiler stack for cpu, gpu and specialized accelerators ☆34 · Updated 2 years ago