usefulsensors / useful-transformers
Efficient Inference of Transformer models
☆353Updated last month
Related projects:
- Run Large Language Models on RK3588 with GPU-acceleration☆79Updated last year
- My development fork of llama.cpp. For now working on RK3588 NPU and Tenstorrent backend☆62Updated last week
- CLIP inference in plain C/C++ with no extra dependencies☆430Updated last month
- Reverse engineering the rk3588 npu☆58Updated 3 months ago
- Easier usage of LLMs in Rockchip's NPU on SBCs like Orange Pi 5 and Radxa Rock 5 series☆52Updated last month
- Easy usage of Rockchip's NPUs found in RK3588 and similar chips☆80Updated 2 months ago
- An innovative library for efficient LLM inference via low-bit quantization☆342Updated 2 weeks ago
- Suno AI's Bark model in C/C++ for fast text-to-speech☆684Updated 2 months ago
- A ggml (C++) re-implementation of tortoise-tts☆147Updated 3 weeks ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml☆215Updated 5 months ago
- Python bindings for ggml☆125Updated 2 weeks ago
- Using FastChat-T5 Large Language Model, Vosk API for automatic speech recognition, and Piper for text-to-speech☆97Updated last year
- Robust Speech Recognition via Large-Scale Weak Supervision☆65Updated last year
- ggml implementation of BERT☆460Updated 6 months ago
- Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector…☆171Updated 2 weeks ago
- Pybind11 bindings for Whisper.cpp☆321Updated this week
- LLaVA server (llama.cpp).☆173Updated 10 months ago
- Improving transcription performance of OpenAI Whisper for CPU based deployment☆234Updated last year
- Streaming TTS based on Piper with optional RK3588 NPU support☆36Updated last month
- LLaMa/RWKV onnx models, quantization and testcase☆345Updated last year
- An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines☆276Updated 3 weeks ago
- Falcon LLM ggml framework with CPU and GPU support☆245Updated 7 months ago
- Extend the original llama.cpp repo to support redpajama model.☆117Updated 2 weeks ago
- LLM-based code completion engine☆172Updated last year
- Python bindings for whisper.cpp☆150Updated this week
- C++ implementation for 💫StarCoder☆443Updated last year
- openvino version of openai/whisper☆157Updated 10 months ago
- Optimized OpenAI's Whisper TFLite Port for Efficient Offline Inference on Edge Devices☆148Updated 3 weeks ago