sasha0552 / vllm-ci
CI scripts designed to build a Pascal-compatible version of vLLM.
☆12 · Aug 10, 2024 · Updated last year
Alternatives and similar repositories for vllm-ci
Users who are interested in vllm-ci are comparing it to the libraries listed below.
- A library and CLI utilities for managing performance states of NVIDIA GPUs. ☆33 · Oct 6, 2024 · Updated last year
- Structured TRIZ prompt engineering for LLMs in an open, portable XML format – MIT licensed. ☆14 · Nov 11, 2025 · Updated 3 months ago
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models ☆41 · Aug 4, 2023 · Updated 2 years ago
- ☆11 · Jan 19, 2024 · Updated 2 years ago
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- ☆11 · Jan 28, 2024 · Updated 2 years ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆13 · Mar 30, 2024 · Updated last year
- Oobabooga "Hello World" API example for node.js with Express ☆13 · Jul 2, 2023 · Updated 2 years ago
- ☆14 · Aug 25, 2024 · Updated last year
- An OpenAI API compatible images server to generate or manipulate images. ☆17 · Feb 2, 2025 · Updated last year
- A Local Proxy and Compatibility Layer for LLM Services ☆11 · May 2, 2024 · Updated last year
- Let's have some retro gaming fun with AI! Join the discord: https://discord.gg/5xXzkMu8Zk ☆62 · Nov 19, 2025 · Updated 2 months ago
- LLM backed Fantasy Tribe Game ☆19 · Nov 21, 2024 · Updated last year
- 🎬 Powerful, high-performance animations for PixiJS ☆20 · Nov 19, 2025 · Updated 2 months ago
- Reproducing GPT on the TinyStories dataset ☆19 · Jan 18, 2024 · Updated 2 years ago
- A tool that can be used to measure the sequential performance of any OpenAI-compatible LLM API ☆22 · Aug 1, 2024 · Updated last year
- Implementation of stop sequencer for Huggingface Transformers ☆16 · Jun 6, 2023 · Updated 2 years ago
- Friendly Terminal Assistant for Developers ☆17 · Mar 23, 2024 · Updated last year
- ☆19 · Jun 5, 2023 · Updated 2 years ago
- ☆17 · Dec 16, 2024 · Updated last year
- ☆17 · Dec 18, 2023 · Updated 2 years ago
- Multi-agent autonomous research system using LangGraph and LangChain. Generates citation-backed reports with credibility scoring and web … ☆139 · Jan 1, 2026 · Updated last month
- A guide to testing different runpod (and other linux VMs) configurations. Specifically the speed of LLM outputs ☆17 · Jan 12, 2024 · Updated 2 years ago
- Benchmarking tool for vLLM inference performance with GPU monitoring ☆40 · Nov 24, 2025 · Updated 2 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆88 · Feb 7, 2026 · Updated last week
- ☆16 · Jul 13, 2024 · Updated last year
- Proteus is an experimental platform that combines the power of Large Language Models with the Genesis physics engine ☆25 · Dec 20, 2024 · Updated last year
- The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge ☆24 · Mar 5, 2025 · Updated 11 months ago
- ☆24 · Mar 10, 2025 · Updated 11 months ago
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning ☆30 · May 18, 2025 · Updated 8 months ago
- 5X faster 60% less memory QLoRA finetuning ☆21 · May 28, 2024 · Updated last year
- A daemon that automatically manages the performance states of NVIDIA GPUs. ☆112 · Nov 2, 2025 · Updated 3 months ago
- software for mini keyboards ☆35 · Jun 17, 2023 · Updated 2 years ago
- Demo of an "always-on" AI assistant. ☆24 · Feb 14, 2024 · Updated 2 years ago
- Prometheus exporter for Linux based GDDR6/GDDR6X VRAM and GPU Core Hot spot temperature reader for NVIDIA RTX 3000/4000 series GPUs. ☆24 · Oct 2, 2024 · Updated last year
- Stable Diffusion and Flux in pure C/C++ ☆24 · Feb 7, 2026 · Updated last week
- [EACL'23] MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages ☆23 · Feb 13, 2023 · Updated 3 years ago
- Ampere optimized llama.cpp ☆32 · Jan 30, 2026 · Updated 2 weeks ago
- Discord chatbot interface to train an LLM on user message history ☆27 · Jun 9, 2023 · Updated 2 years ago