arlo-phoenix / CTranslate2-rocm
Fast inference engine for Transformer models
☆56 · Updated last year
Alternatives and similar repositories for CTranslate2-rocm
Users that are interested in CTranslate2-rocm are comparing it to the libraries listed below
- ☆428 · Updated 10 months ago
- AMD (Radeon GPU) ROCm based setup for popular AI tools on Ubuntu 24.04.1 ☆217 · Updated last week
- A complete package that provides you with all the components needed to get started or dive deeper into Machine Learning Workloads on Cons… ☆50 · Updated this week
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆626 · Updated 2 weeks ago
- A daemon that automatically manages the performance states of NVIDIA GPUs. ☆112 · Updated 3 months ago
- The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm ☆770 · Updated this week
- ☆90 · Updated 2 months ago
- Whisper command line client compatible with original OpenAI client based on CTranslate2. ☆1,207 · Updated this week
- Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, Kokoro-TTS, Embedding and Rerank models over OpenAI endpoints. ☆295 · Updated this week
- ROCm docker images with fixes/support for extra architectures, such as gfx803/gfx1010. ☆31 · Updated 2 years ago
- The official API server for Exllama. OAI compatible, lightweight, and fast. ☆1,129 · Updated this week
- Stable Diffusion and Flux in pure C/C++ ☆24 · Updated last week
- 8-bit CUDA functions for PyTorch ☆70 · Updated 4 months ago
- Fresh builds of llama.cpp with AMD ROCm™ 7 acceleration ☆187 · Updated this week
- llama.cpp fork with additional SOTA quants and improved performance ☆1,605 · Updated this week
- Input text from speech in any Linux window, the lean, fast and accurate way, using whisper.cpp OFFLINE. Speak with local LLMs via llama.c… ☆167 · Updated 6 months ago
- Fork of ollama for vulkan support ☆109 · Updated 11 months ago
- ☆238 · Updated 2 years ago
- The main repository for building Pascal-compatible versions of ML applications and libraries. ☆169 · Updated 5 months ago
- Croco.Cpp is a fork of KoboldCPP inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compati… ☆156 · Updated this week
- Simple monkeypatch to boost AMD Navi 3 GPUs ☆47 · Updated 9 months ago
- Prometheus exporter for Linux based GDDR6/GDDR6X VRAM and GPU Core Hot spot temperature reader for NVIDIA RTX 3000/4000 series GPUs. ☆24 · Updated last year
- Reliable model swapping for any local OpenAI/Anthropic compatible server - llama.cpp, vllm, etc ☆2,374 · Updated this week
- Easy to use interface for the Whisper model optimized for all GPUs! ☆463 · Updated last month
- An OpenAI API compatible text to speech server using Coqui AI's xtts_v2 and/or piper tts as the backend. ☆850 · Updated last year
- The Scalable Video Technology for AV1 (SVT-AV1 Encoder and Decoder) with perceptual enhancements for psychovisually optimal AV1 encoding ☆399 · Updated 9 months ago
- Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama - but purpose-built and deeply optimized for the AMD NPUs. ☆707 · Updated this week
- Core, Junction, and VRAM temperature reader for Linux + GDDR6/GDDR6X GPUs ☆67 · Updated 3 months ago
- An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines ☆540 · Updated last year
- llama.cpp-gfx906 ☆90 · Updated this week