arlo-phoenix / CTranslate2-rocm
Fast inference engine for Transformer models
☆37 · Updated 6 months ago
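For orientation, the translation flow in CTranslate2's Python API looks roughly like the sketch below. It follows the upstream CTranslate2 interface; the model directory, the tokenizer path, and the assumption that this ROCm fork keeps the upstream `device="cuda"` selector (HIP reuses the CUDA code path) are illustrative, not confirmed by this repository.

```python
# Minimal sketch of the upstream CTranslate2 translation flow.
# Assumptions: "ende_ct2/" is a converted model directory and "source.spm" its
# SentencePiece model; whether this ROCm fork exposes the GPU as device="cuda"
# is an assumption, not taken from the repository.
import ctranslate2
import sentencepiece as spm

translator = ctranslate2.Translator("ende_ct2", device="cuda")      # hypothetical model dir
sp = spm.SentencePieceProcessor(model_file="ende_ct2/source.spm")   # hypothetical tokenizer

tokens = sp.encode("Hello world!", out_type=str)    # text -> subword tokens
results = translator.translate_batch([tokens])      # batched beam/greedy decoding
print(sp.decode(results[0].hypotheses[0]))          # best hypothesis -> text
```

For decoder-only LLMs the same pattern applies through `ctranslate2.Generator` and `generate_batch` in the upstream API.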
Alternatives and similar repositories for CTranslate2-rocm
Users interested in CTranslate2-rocm are comparing it to the libraries listed below.
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆385 · Updated this week
- ☆75 · Updated this week
- Linux-based GDDR6/GDDR6X VRAM temperature reader for NVIDIA RTX 3000/4000 series GPUs. ☆98 · Updated last month
- Lightweight inference server for OpenVINO ☆180 · Updated this week
- Croco.Cpp is a 3rd-party testground for KoboldCPP, a simple one-file way to run various GGML/GGUF models with KoboldAI's UI. (for Croco.C… ☆107 · Updated this week
- ☆326 · Updated 2 months ago
- A daemon that automatically manages the performance states of NVIDIA GPUs. ☆86 · Updated last month
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆153 · Updated last year
- Stable Diffusion and Flux in pure C/C++ ☆15 · Updated this week
- The official API server for Exllama. OAI-compatible, lightweight, and fast. ☆969 · Updated last week
- AMD (Radeon GPU) ROCm-based setup for popular AI tools on Ubuntu 24.04.1 ☆204 · Updated 3 months ago
- My personal fork of koboldcpp where I hack in experimental samplers. ☆46 · Updated last year
- A utility that uses Whisper to transcribe videos and various translation APIs to translate the transcribed text and save them as SRT (sub… ☆72 · Updated 9 months ago
- Web UI for ExLlamaV2 ☆495 · Updated 4 months ago
- Core, Junction, and VRAM temperature reader for Linux + GDDR6/GDDR6X GPUs ☆42 · Updated 3 weeks ago
- llama.cpp fork with additional SOTA quants and improved performance ☆548 · Updated this week
- Input text from speech in any Linux window, the lean, fast and accurate way, using whisper.cpp OFFLINE. Speak with local LLMs via llama.c… ☆103 · Updated last week
- A zero-dependency web UI for any LLM backend, including KoboldCpp, OpenAI and AI Horde ☆123 · Updated this week
- AI inferencing at the edge. A simple one-file way to run various GGML models with KoboldAI's UI with AMD ROCm offloading ☆624 · Updated last week
- Easy-to-use interface for the Whisper model, optimized for all GPUs! ☆214 · Updated last week
- ☆96 · Updated last year
- Simple monkeypatch to boost AMD Navi 3 GPUs ☆42 · Updated last month
- ROCm Docker images with fixes/support for extra architectures, such as gfx803/gfx1010. ☆30 · Updated last year
- LLM frontend in a single HTML file ☆487 · Updated 4 months ago
- 8-bit CUDA functions for PyTorch ☆53 · Updated 3 weeks ago
- Suno AI's Bark model in C/C++ for fast text-to-speech generation ☆824 · Updated 6 months ago
- ☆41 · Updated 2 years ago
- A simple FastAPI server to run XTTSv2 ☆516 · Updated 10 months ago
- AI-powered search tool offering system-wide content-based, text, and visual similarity search. ☆247 · Updated last week
- 8-bit CUDA functions for PyTorch, ROCm-compatible ☆41 · Updated last year