Pre-built wheels for llama-cpp-python across platforms and CUDA versions
☆62 · Apr 18, 2026 · Updated last month
Alternatives and similar repositories for llama-cpp-python-wheels
Users that are interested in llama-cpp-python-wheels are comparing it to the libraries listed below.
- RF-DETR + USLS: object detection using Rust ☆15 · Apr 12, 2025 · Updated last year
- minimalistic AI library that resembles HF's transformers ☆13 · Dec 31, 2024 · Updated last year
- Multi-turn dataset management tool for LLM trainers ☆12 · Mar 31, 2025 · Updated last year
- 🤖 A multilingual translation tool that automatically converts Hugging Face's daily AI research papers into 🇯🇵 Japanese, 🇰🇷 Korean, … ☆18 · May 13, 2026 · Updated last week
- The web application for Texo, a minimalist SOTA LaTeX OCR model with only 20M parameters that runs in the browser. | Ultra-lightweight SOTA LaTeX formula recognition model… ☆53 · Feb 23, 2026 · Updated 2 months ago
- Dedicated Colab notebooks for experimenting with OCR models (Nanonets OCR, Monkey OCR, OCRFlux 3B, Typhoo OCR 3B & more) on a free-tier T4 GPU ☆23 · Feb 12, 2026 · Updated 3 months ago
- Gives each individual character their own memory. ☆30 · Jun 1, 2025 · Updated 11 months ago
- ☆25 · Oct 24, 2025 · Updated 6 months ago
- Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without lossi… ☆139 · Mar 24, 2026 · Updated last month
- MediBeng Whisper Tiny improves doctor-patient transcription by training the Whisper Tiny model to translate mixed Bengali-English speech… ☆29 · Jul 24, 2025 · Updated 9 months ago
- ☆28 · Feb 10, 2026 · Updated 3 months ago
- “YOLOLite — lightweight YOLO in PyTorch. ONNX export + CPU inference (Raspberry Pi friendly).” ☆65 · Apr 14, 2026 · Updated last month
- Simple customizable evaluation for text retrieval performance of Sentence Transformers embedders on PDFs ☆30 · Jan 20, 2025 · Updated last year
- Adetailer for sdxl diffusers pipeline. ☆26 · Dec 16, 2024 · Updated last year
- LoRAMaster - LoRA Training Master, an open-source tool focused on LoRA training ☆165 · Feb 11, 2026 · Updated 3 months ago
- A database for modern, open-source TTS systems. ☆32 · Feb 4, 2026 · Updated 3 months ago
- This project provides a production-ready, real-time inference server for LatentSync, enabling high-quality, low-latency 2D digital human … ☆24 · Aug 16, 2025 · Updated 9 months ago
- A Python package for loading and converting images ☆30 · Feb 18, 2026 · Updated 3 months ago
- FSampler is a training‑free, sampler‑agnostic acceleration layer for diffusion sampling. ☆129 · Feb 28, 2026 · Updated 2 months ago
- Your AI coworker for any folder: local-first, secure by design, cross-platform, and built for supervised automation. ☆100 · Updated this week
- An API service that proxies Groq ☆13 · May 25, 2024 · Updated last year
- Fork of SpargeAttention (SparseSageAttention) for Windows wheels and easy installation ☆36 · Mar 24, 2026 · Updated last month
- OpenWrt PIA WireGuard Script ☆19 · Apr 13, 2026 · Updated last month
- WireGuard config file generator for PIA VPN. ☆23 · Dec 5, 2022 · Updated 3 years ago
- ☆19 · Dec 13, 2023 · Updated 2 years ago
- [NeurIPS 2025 Spotlight] LeMiCa: Lexicographic Minimax Path Caching for Efficient Diffusion-Based Video Generation ☆115 · Apr 28, 2026 · Updated 3 weeks ago
- (Integrated package) One-click use of ModelBest's latest MiniCPM-o 2.6 multimodal model for video, voice, and text conversations. ☆15 · Jul 13, 2025 · Updated 10 months ago
- Adapt IPEX to CUDA ☆44 · Updated this week
- ☆27 · Jan 25, 2026 · Updated 3 months ago
- A Model Context Protocol (MCP) server that provides hourly and daily weather forecasts using the AccuWeather API. ☆31 · Sep 8, 2025 · Updated 8 months ago
- Personalize Anything for Free with Diffusion Transformer, use it in ComfyUI with wrapper mode ☆44 · Mar 26, 2025 · Updated last year
- ☆52 · Feb 19, 2025 · Updated last year
- An Extensive AI & Camera Metadata Viewer ☆68 · Apr 14, 2026 · Updated last month
- llama.cpp fork with additional SOTA quants and improved performance ☆22 · May 13, 2026 · Updated last week
- A Chrome extension that makes long ChatGPT conversations fast again by virtualizing off-screen messages without losing any context. ☆131 · Feb 22, 2026 · Updated 2 months ago
- A library for working with prompt templates locally or on the Hugging Face Hub. ☆56 · Mar 5, 2025 · Updated last year
- A highly optimized engine for neutts-air model to generate minutes of audio in seconds. Over 200x realtime on modern hardware! ☆119 · Nov 24, 2025 · Updated 5 months ago
- An Alpine Linux docker container running Privoxy and OpenVPN via Private Internet Access ☆20 · Apr 13, 2026 · Updated last month
- Cross-platform installer for Triton and SageAttention on ComfyUI. Simplifies GPU-accelerated inference setup for Windows users with autom… ☆126 · Mar 31, 2026 · Updated last month