Wheels for llama-cpp-python compiled with cuBLAS support
☆102Feb 1, 2024Updated 2 years ago
Alternatives and similar repositories for llama-cpp-python-cuBLAS-wheels
Users that are interested in llama-cpp-python-cuBLAS-wheels are comparing it to the libraries listed below
- A combination of Oobabooga's fork and the main cuda branch of GPTQ-for-LLaMa in a package format.☆23Oct 6, 2023Updated 2 years ago
- Wheels for llama-cpp-python compiled with cuBLAS support☆27Apr 9, 2025Updated 10 months ago
- 8-bit CUDA functions for PyTorch☆26Nov 18, 2023Updated 2 years ago
- A Next.js chatbot app demonstrating seamless integration with window.ai.☆15Jun 25, 2023Updated 2 years ago
- ☆13Oct 30, 2023Updated 2 years ago
- Simple extension for text-generation-webui that injects recent conversation history into the negative prompt with the goal of minimizing …☆32Nov 20, 2023Updated 2 years ago
- Dynamic parameter modulation for oobabooga's text-generation-webui that adjusts generation parameters to better mirror user affect.☆36Jul 28, 2023Updated 2 years ago
- Fast and memory-efficient exact attention - Windows wheels☆33Mar 3, 2024Updated last year
- ToolAgents is a lightweight and flexible framework for creating function-calling agents with various language models and APIs.☆27Dec 13, 2025Updated 2 months ago
- A Python library to split a Chinese Pinyin phrase into possible permutations of Chinese Pinyin words☆13Aug 10, 2021Updated 4 years ago
- ☆16Jun 6, 2023Updated 2 years ago
- Rivet plugin to access E2B goodies☆10Feb 6, 2025Updated last year
- Video Object Detection in Java with OpenCV + YOLO11 - full end-to-end tutorial☆46Nov 24, 2025Updated 3 months ago
- Data Analysis, Analytics, Science, AI & ML, LLM etc.☆15Jun 6, 2025Updated 8 months ago
- ☆12Nov 2, 2025Updated 3 months ago
- An embeddable widget for interacting with OpenAI API-compatible LLMs☆14Sep 18, 2024Updated last year
- A script for merging a LLM model and a LoRA☆13Jun 22, 2023Updated 2 years ago
- ☆14Apr 5, 2023Updated 2 years ago
- An implementation of LLMzip using GPT-2☆13Aug 7, 2023Updated 2 years ago
- Web UI for ChatGPT, Google Gemini & Cloudflare Workers AI☆14Updated this week
- Science-driven chatbot development☆61May 5, 2024Updated last year
- animatediff prompt travel☆19Jan 27, 2024Updated 2 years ago
- ☆14Apr 2, 2024Updated last year
- Web UI for ExLlamaV2☆512Feb 5, 2025Updated last year
- A chatbot UI for RAG, multimodal, text completion. (support Transformers, llama.cpp, MLX, vLLM)☆20Apr 18, 2024Updated last year
- ☆11Updated this week
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto…☆43Mar 21, 2024Updated last year
- Hugging Face model loaders☆20May 27, 2024Updated last year
- Advanced Coding AI Assistant that uses a Gradio interface to stream coding related responses. ChatRAG supports local and API inference an…☆23May 6, 2025Updated 9 months ago
- vLLM hybrid inference extension plugin; supports multi-NUMA hybrid inference, and single-GPU inference of the Qwen3-Next model can reach 1000+ prefill☆31Nov 7, 2025Updated 3 months ago
- Terminal Voice Assistant is a powerful and flexible tool designed to help users interact with their terminal using natural language comma…☆19Jun 9, 2024Updated last year
- oobabooga extension - Experimental sampler to make LLMs more creative☆23Aug 2, 2023Updated 2 years ago
- Llama.cui is a small llama.cpp-based chat application for Node.js☆20Jul 10, 2025Updated 7 months ago
- extension for text WebUI☆19Aug 7, 2025Updated 6 months ago
- ☆21Dec 18, 2023Updated 2 years ago
- AI-generated music video with Riffusion and Gradio☆23Dec 16, 2022Updated 3 years ago
- Generative AI web UI and server☆22May 23, 2023Updated 2 years ago
- A simple batch file that makes the oobabooga one-click installer compatible with LLaMA 4-bit models and able to run on CUDA☆21Mar 27, 2023Updated 2 years ago
- lightweight LAMA inference wrapper☆26Sep 28, 2023Updated 2 years ago