Pre-built wheels for llama-cpp-python across platforms and CUDA versions
☆50Nov 9, 2025Updated 5 months ago
Alternatives and similar repositories for llama-cpp-python-wheels
Users that are interested in llama-cpp-python-wheels are comparing it to the libraries listed below.
- Huggingface Backup - Jupyter, Colab and Python Script☆10Jan 20, 2026Updated 2 months ago
- minimalistic AI library that resembles HF's transformers☆13Dec 31, 2024Updated last year
- Multi-turn dataset management tool for LLM trainers☆12Mar 31, 2025Updated last year
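- Chinese translation table for Danbooru tags☆21Mar 21, 2025Updated last year
- A simple speech synthesis service based on edge-tts; supports private deployment and seamless integration with the 源阅读 app.☆20Aug 19, 2025Updated 7 months ago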
- 🤖 A multilingual translation tool that automatically converts Hugging Face's daily AI research papers into 🇯🇵 Japanese, 🇰🇷 Korean, …☆18Apr 2, 2026Updated last week
- The web application for Texo, a minimalist SOTA LaTeX OCR model that contains only 20M parameters and runs in the browser. | Ultra-lightweight SOTA LaTeX formula recognition model…☆47Feb 23, 2026Updated last month
- Dedicated Colab notebooks for experimenting with OCR models (Nanonets OCR, Monkey OCR, OCRFlux 3B, Typhoo OCR 3B, and more) on a free-tier T4 GPU☆23Feb 12, 2026Updated last month
- Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without lossi…☆131Mar 24, 2026Updated 2 weeks ago
- Gives each individual character their own memory.☆30Jun 1, 2025Updated 10 months ago
- ☆28Oct 24, 2025Updated 5 months ago
- Docker container for suno-ai bark model☆12Jun 26, 2023Updated 2 years ago
- YOLOLite: lightweight YOLO in PyTorch. ONNX export + CPU inference (Raspberry Pi friendly).☆59Feb 2, 2026Updated 2 months ago
- Online AI tag selector, improved from an open-source project with some new features added☆29Mar 9, 2026Updated last month
- [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-t…☆58Jan 19, 2026Updated 2 months ago
- MediBeng Whisper Tiny improves doctor-patient transcription by training the Whisper Tiny model to translate mixed Bengali-English speech…☆29Jul 24, 2025Updated 8 months ago
- ☆28Feb 10, 2026Updated last month
- Simple customizable evaluation for text retrieval performance of Sentence Transformers embedders on PDFs☆30Jan 20, 2025Updated last year
- This project provides a production-ready, real-time inference server for LatentSync, enabling high-quality, low-latency 2D digital human …☆23Aug 16, 2025Updated 7 months ago
- Adetailer for sdxl diffusers pipeline.☆26Dec 16, 2024Updated last year
- LoRAMaster - LoRA training master, an open-source tool focused on LoRA training☆163Feb 11, 2026Updated last month
- Python tool designed to streamline the extraction of automatic captions from CapCut desktop☆15Feb 20, 2024Updated 2 years ago
- A database for modern, open-source TTS systems.☆33Feb 4, 2026Updated 2 months ago
- Your AI coworker for any folder: local-first, secure by design, cross-platform, and built for supervised automation.☆88Updated this week
- FSampler is a training‑free, sampler‑agnostic acceleration layer for diffusion sampling.☆128Feb 28, 2026Updated last month
- ☆20Nov 3, 2025Updated 5 months ago
- Proxy for Groq's API service☆14May 25, 2024Updated last year
- Fork of SpargeAttention (SparseSageAttention) for Windows wheels and easy installation☆34Mar 24, 2026Updated 2 weeks ago
- Split long audio files based on subtitle-info in SRT File (Transcript saved in CSV)☆20Nov 14, 2019Updated 6 years ago
- OpenWrt PIA WireGuard Script☆19Mar 23, 2026Updated 2 weeks ago
- Wireguard config file generator for PIA VPN.☆23Dec 5, 2022Updated 3 years ago
- Adapts an OpenAI-compatible API interface for MNN on Android☆18Apr 21, 2025Updated 11 months ago
- ☆24Feb 10, 2025Updated last year
- ☆18Dec 13, 2023Updated 2 years ago
- Adapt IPEX to CUDA☆42Mar 24, 2026Updated 2 weeks ago
- Personalize Anything for Free with Diffusion Transformer; use it in ComfyUI with wrapper mode☆44Mar 26, 2025Updated last year
- ☆25Jan 25, 2026Updated 2 months ago
- A Model Context Protocol (MCP) server that provides hourly and daily weather forecasts using the AccuWeather API.☆32Sep 8, 2025Updated 7 months ago
- llama.cpp fork with additional SOTA quants and improved performance☆22Apr 2, 2026Updated last week