cduk / vllm-pascal
A fork of vLLM enabling Pascal architecture GPUs
☆28 Updated 4 months ago
Alternatives and similar repositories for vllm-pascal
Users who are interested in vllm-pascal are comparing it to the repositories listed below.
- Fully-featured, beautiful web interface for vLLM - built with NextJS. ☆146 Updated 2 months ago
- A multimodal, function calling powered LLM webui. ☆214 Updated 9 months ago
- An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal. ☆257 Updated 4 months ago
- A fast batching API to serve LLM models ☆183 Updated last year
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆156 Updated last year
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation … ☆185 Updated 11 months ago
- This extension enhances the capabilities of textgen-webui by integrating advanced vision models, allowing users to have contextualized co… ☆56 Updated 8 months ago
- ☆95 Updated 6 months ago
- This project demonstrates a basic chain-of-thought interaction with any LLM (Large Language Model) ☆321 Updated 9 months ago
- automatically quant GGUF models ☆187 Updated this week
- Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. F… ☆188 Updated 3 months ago
- Inference service for Qwen2.5-VL-7b model ☆188 Updated 3 months ago
- ☆49 Updated 4 months ago
- This is the Mixture-of-Agents (MoA) concept, adapted from the original work by TogetherAI. My version is tailored for local model usage a… ☆117 Updated last year
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆319 Updated 4 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆28 Updated 5 months ago
- Docker compose to run vLLM on Windows ☆92 Updated last year
- ☆206 Updated 2 months ago
- Service for testing out the new Qwen2.5 omni model ☆54 Updated 2 months ago
- A Conversational Speech Generation Model with Gradio UI and OpenAI compatible API. UI and API support CUDA, MLX and CPU devices. ☆192 Updated 2 months ago
- ☆204 Updated last month
- ☆131 Updated 2 months ago
- A local AI companion that uses a collection of free, open source AI models in order to create two virtual companions that will follow you… ☆222 Updated last month
- Open source LLM UI, compatible with all local LLM providers. ☆175 Updated 9 months ago
- ☆107 Updated 2 months ago
- Local LLM Powered Recursive Search & Smart Knowledge Explorer ☆244 Updated 5 months ago
- GPU Power and Performance Manager ☆60 Updated 9 months ago
- Code execution utilities for Open WebUI & Ollama ☆290 Updated 8 months ago
- A library and CLI utilities for managing performance states of NVIDIA GPUs. ☆27 Updated 9 months ago
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆436 Updated this week