cduk / vllm-pascal
A fork of vLLM enabling Pascal architecture GPUs
☆28 Updated 6 months ago
Alternatives and similar repositories for vllm-pascal
Users who are interested in vllm-pascal are comparing it to the libraries listed below.
- Service for testing out the new Qwen2.5 omni model ☆57 Updated 4 months ago
- GPU Power and Performance Manager ☆61 Updated 10 months ago
- ☆96 Updated last week
- An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal. ☆259 Updated 5 months ago
- Fully-featured, beautiful web interface for vLLM - built with NextJS. ☆150 Updated 3 months ago
- ☆132 Updated 4 months ago
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation … ☆184 Updated last year
- A multimodal, function calling powered LLM webui. ☆216 Updated 11 months ago
- A Conversational Speech Generation Model with Gradio UI and OpenAI compatible API. UI and API support CUDA, MLX and CPU devices. ☆197 Updated 3 months ago
- Inference service for Qwen2.5-VL-7b model ☆194 Updated 5 months ago
- Docker compose to run vLLM on Windows ☆97 Updated last year
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆162 Updated last year
- automatically quant GGUF models ☆196 Updated last week
- A fast batching API to serve LLM models ☆187 Updated last year
- This is the Mixture-of-Agents (MoA) concept, adapted from the original work by TogetherAI. My version is tailored for local model usage a… ☆118 Updated last year
- This extension enhances the capabilities of textgen-webui by integrating advanced vision models, allowing users to have contextualized co… ☆57 Updated 10 months ago
- Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. F… ☆232 Updated 4 months ago
- Local LLM Powered Recursive Search & Smart Knowledge Explorer ☆251 Updated 6 months ago
- Simple UI for Llama-3.2-11B-Vision & Molmo-7B-D ☆137 Updated 11 months ago
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning ☆27 Updated 3 months ago
- ☆221 Updated 3 months ago
- ☆50 Updated 6 months ago
- Open source LLM UI, compatible with all local LLM providers. ☆174 Updated 11 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆73 Updated last week
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆29 Updated 7 months ago
- The Fastest Way to Fine-Tune LLMs Locally ☆316 Updated 5 months ago
- Easily view and modify JSON datasets for large language models ☆81 Updated 3 months ago
- Deploy Apollo HF space locally ☆40 Updated 8 months ago
- ☆209 Updated last month
- 🗣️ Real‑time, low‑latency voice, vision, and conversational‑memory AI assistant built on LiveKit and local LLMs ✨ ☆89 Updated 2 months ago