cduk / vllm-pascal
A fork of vLLM enabling Pascal architecture GPUs
☆30 · Updated 9 months ago
Alternatives and similar repositories for vllm-pascal
Users that are interested in vllm-pascal are comparing it to the libraries listed below
- An OpenAI API-compatible API for chat with image input and questions about the images, aka Multimodal. ☆265 · Updated 8 months ago
- GPU Power and Performance Manager ☆61 · Updated last year
- A fast batching API to serve LLM models ☆188 · Updated last year
- ☆106 · Updated 3 months ago
- A multimodal, function calling powered LLM webui. ☆216 · Updated last year
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation … ☆190 · Updated last year
- ☆226 · Updated 6 months ago
- ☆134 · Updated 6 months ago
- A Conversational Speech Generation Model with Gradio UI and OpenAI compatible API. UI and API support CUDA, MLX and CPU devices. ☆208 · Updated 6 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆29 · Updated 10 months ago
- Automatically quantize GGUF models ☆214 · Updated 3 weeks ago
- ☆208 · Updated 2 months ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆165 · Updated last year
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆342 · Updated 8 months ago
- An Open WebUI function for a better R1 experience ☆77 · Updated 8 months ago
- ☆57 · Updated last year
- ☆51 · Updated 9 months ago
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆571 · Updated last week
- ☆20 · Updated last year
- A local AI companion that uses a collection of free, open source AI models in order to create two virtual companions that will follow you… ☆236 · Updated last month
- 🗣️ Real‑time, low‑latency voice, vision, and conversational‑memory AI assistant built on LiveKit and local LLMs ✨ ☆97 · Updated 4 months ago
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning ☆29 · Updated 6 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆83 · Updated 3 weeks ago
- llama.cpp fork with additional SOTA quants and improved performance ☆21 · Updated this week
- ☆83 · Updated 8 months ago
- ☆173 · Updated 3 months ago
- This small API downloads and exposes access to NeuML's txtai-wikipedia and full wikipedia datasets, taking in a query and returning full … ☆101 · Updated 2 months ago
- Real-time TTS reading of large text files with your favourite voice. +Translation via LLM (Python script) ☆52 · Updated last year
- This extension enhances the capabilities of textgen-webui by integrating advanced vision models, allowing users to have contextualized co… ☆57 · Updated last year
- Open source LLM UI, compatible with all local LLM providers. ☆176 · Updated last year