usefulsensors / useful-transformers
Efficient Inference of Transformer models
☆391Updated 3 months ago
Related projects
Alternatives and complementary repositories for useful-transformers
- ☆413Updated 2 weeks ago
- Run Large Language Models on RK3588 with GPU-acceleration☆86Updated last year
- Easy usage of Rockchip's NPUs found in RK3588 and similar chips☆96Updated last week
- My development fork of llama.cpp. For now working on RK3588 NPU and Tenstorrent backend☆67Updated this week
- Easier usage of LLMs in Rockchip's NPU on SBCs like Orange Pi 5 and Radxa Rock 5 series☆68Updated 2 weeks ago
- OpenAI Whisper for edge devices☆115Updated last year
- Robust Speech Recognition via Large-Scale Weak Supervision☆69Updated last year
- Inference Vision Transformer (ViT) in plain C/C++ with ggml☆233Updated 7 months ago
- Reverse engineering the rk3588 npu☆63Updated 5 months ago
- LLaVA server (llama.cpp).☆177Updated last year
- Improving transcription performance of OpenAI Whisper for CPU based deployment☆237Updated 2 years ago
- A ggml (C++) re-implementation of tortoise-tts☆159Updated 3 months ago
- Streaming TTS based on Piper with optional RK3588 NPU support☆44Updated last month
- An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines☆312Updated 2 months ago
- Python bindings for ggml☆132Updated 2 months ago
- CLIP inference in plain C/C++ with no extra dependencies☆459Updated 3 months ago
- Suno AI's Bark model in C/C++ for fast text-to-speech generation☆730Updated this week
- Supporting PyTorch models with the Google AI Edge TFLite runtime.☆371Updated this week
- ONNX implementation of Whisper. PyTorch free.☆85Updated 3 months ago
- An innovative library for efficient LLM inference via low-bit quantization☆348Updated 2 months ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens☆444Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆253Updated last month
- On-device LLM Inference Powered by X-Bit Quantization☆189Updated this week
- top-like script for Rockchip NPUs on Linux☆25Updated 2 weeks ago
- Generative AI extensions for onnxruntime☆514Updated this week
- Using FastChat-T5 Large Language Model, Vosk API for automatic speech recognition, and Piper for text-to-speech☆111Updated last year
- Optimized OpenAI's Whisper TFLite Port for Efficient Offline Inference on Edge Devices☆178Updated 2 months ago
- Pure C++ implementation of several models for real-time chatting on your computer (CPU)☆409Updated this week
- Open source repo for AI in a Box.☆56Updated 7 months ago
- ☆459Updated 4 months ago