arlo-phoenix / CTranslate2-rocm
Fast inference engine for Transformer models
☆47 · Updated 11 months ago
Alternatives and similar repositories for CTranslate2-rocm
Users interested in CTranslate2-rocm are comparing it to the libraries listed below.
- ☆399 · Updated 6 months ago
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆513 · Updated last week
- AI Inferencing at the Edge. A simple one-file way to run various GGML models with KoboldAI's UI, with AMD ROCm offloading ☆699 · Updated last month
- llama.cpp fork with additional SOTA quants and improved performance ☆1,246 · Updated this week
- The official API server for Exllama. OAI compatible, lightweight, and fast. ☆1,061 · Updated this week
- AMD (Radeon GPU) ROCm based setup for popular AI tools on Ubuntu 24.04.1 ☆211 · Updated 3 weeks ago
- A complete package that provides you with all the components needed to get started or dive deeper into Machine Learning Workloads on Cons… ☆38 · Updated last week
- The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm ☆438 · Updated this week
- A daemon that automatically manages the performance states of NVIDIA GPUs. ☆95 · Updated 2 weeks ago
- llama-swap + a minimal ollama compatible API ☆28 · Updated this week
- Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, Kokoro-TTS over OpenAI endpoints. ☆211 · Updated this week
- Simple monkeypatch to boost AMD Navi 3 GPUs ☆46 · Updated 5 months ago
- ☆83 · Updated this week
- ROCm docker images with fixes/support for extra architectures, such as gfx803/gfx1010. ☆31 · Updated 2 years ago
- LLM Frontend in a single HTML file ☆647 · Updated 8 months ago
- ☆42 · Updated 2 years ago
- Model swapping for llama.cpp (or any local OpenAI API compatible server) ☆1,655 · Updated this week
- Easy to use interface for the Whisper model, optimized for all GPUs! ☆320 · Updated 2 months ago
- Linux based GDDR6/GDDR6X VRAM temperature reader for NVIDIA RTX 3000/4000 series GPUs. ☆104 · Updated 5 months ago
- Whisper command line client compatible with the original OpenAI client, based on CTranslate2. ☆1,116 · Updated 2 months ago
- Croco.Cpp is a fork of KoboldCPP, inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compati… ☆147 · Updated this week
- Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama - but purpose-built and deeply optimized for the AMD NPUs. ☆280 · Updated this week
- Input text from speech in any Linux window, the lean, fast and accurate way, using whisper.cpp OFFLINE. Speak with local LLMs via llama.c… ☆142 · Updated 2 months ago
- An OpenAI API compatible text to speech server using Coqui AI's xtts_v2 and/or piper tts as the backend. ☆817 · Updated 8 months ago
- 8-bit CUDA functions for PyTorch ☆63 · Updated 2 weeks ago
- RIFE, Real-Time Intermediate Flow Estimation for Video Frame Interpolation, implemented with the ncnn library ☆54 · Updated 3 months ago
- Stable Diffusion and Flux in pure C/C++ ☆21 · Updated 3 weeks ago
- Stable Diffusion Docker image preconfigured for usage with AMD Radeon cards ☆138 · Updated last year
- ☆474 · Updated last week
- RIFE filter for VapourSynth ☆156 · Updated last month