Inference server benchmarking tool
☆145 · Oct 2, 2025 · Updated 5 months ago
Alternatives and similar repositories for inference-benchmarker
Users who are interested in inference-benchmarker are comparing it to the libraries listed below.
- Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs ☆27 · Dec 17, 2024 · Updated last year
- Minimal implementation of a Byte Pair Encoding (BPE) tokenizer in Zig ☆14 · Apr 7, 2025 · Updated 10 months ago
- An OpenAI API compatible images server to generate or manipulate images. ☆17 · Feb 2, 2025 · Updated last year
- Query, ask and chat with a document-index via transformer models! ☆17 · Jun 22, 2023 · Updated 2 years ago
- ☆26 · Nov 18, 2025 · Updated 3 months ago
- Personal voice assistant, with voice interruption and Twilio support ☆18 · Feb 24, 2025 · Updated last year
- ☆17 · Dec 16, 2024 · Updated last year
- Building synthetic data for preference tuning ☆27 · Dec 26, 2024 · Updated last year
- All-in-one UI for merged LLMs in Hugging Face ☆25 · Jun 10, 2024 · Updated last year
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning ☆30 · May 18, 2025 · Updated 9 months ago
- Quantized text-audio foundation model from Boson AI ☆43 · Aug 13, 2025 · Updated 6 months ago
- AgentKraft: A simple platform for building and deploying conversational AI agents ☆25 · Apr 29, 2025 · Updated 10 months ago
- Run Orpheus 3B Locally with Gradio UI, Standalone App ☆23 · Apr 1, 2025 · Updated 11 months ago
- Prometheus exporter for Linux based GDDR6/GDDR6X VRAM and GPU Core Hot spot temperature reader for NVIDIA RTX 3000/4000 series GPUs. ☆24 · Oct 2, 2024 · Updated last year
- A simple no-install web UI for Ollama and OAI-Compatible APIs! ☆31 · Jan 30, 2025 · Updated last year
- Leveraging LLMs for modernization through intelligent chunking, iterative prompting and reflection, and retrieval augmented generation (R… ☆39 · Feb 17, 2026 · Updated 2 weeks ago
- A unified library for interacting with various AI APIs through a standardized interface. ☆35 · Mar 13, 2025 · Updated 11 months ago
- A Framework for Narrative Agents ☆37 · Updated this week
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 · Sep 19, 2025 · Updated 5 months ago
- QJL: 1-Bit Quantized JL transform for KV Cache Quantization with Zero Overhead ☆32 · Jan 27, 2025 · Updated last year
- Page-wise text recognition with lower-supervision line data models ☆51 · Feb 27, 2026 · Updated last week
- Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and … ☆36 · Aug 29, 2025 · Updated 6 months ago
- MrlX: A Multi-Agent Reinforcement Learning Framework ☆193 · Jan 19, 2026 · Updated last month
- Text to audio with Tik-Tok Voices ☆13 · Apr 6, 2023 · Updated 2 years ago
- This is a mirror of the sourceforge TimeTrex repo ☆10 · Jan 8, 2023 · Updated 3 years ago
- A template code for running modular and reproducible experiments in pytorch ☆13 · Sep 3, 2025 · Updated 6 months ago
- Machine Learning project to identify Japanese characters (hiragana) from a data set. ☆14 · Nov 24, 2018 · Updated 7 years ago
- FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens. ☆1,025 · Sep 4, 2024 · Updated last year
- NexRL is an ultra-loosely-coupled LLM post-training framework. ☆101 · Feb 28, 2026 · Updated last week
- Own your AI, search the web with it🌐😎 ☆93 · Jan 14, 2025 · Updated last year
- A collection of practical, end-to-end AI application examples accelerated by MemryX hardware and software solutions. This repository off… ☆91 · Updated this week
- Find or build all reverse dependencies of a Haskell package using Nix ☆14 · Jul 26, 2020 · Updated 5 years ago
- Voltalis to Home Assistant bridge ☆10 · Oct 5, 2025 · Updated 5 months ago
- A higher quality RVC pretrained model to accelerate your training process. ☆21 · Nov 11, 2025 · Updated 3 months ago
- Wagtail public roadmap ☆12 · Feb 27, 2026 · Updated last week
- Simple and powerful extension for searching web and viewing website content. ☆11 · Apr 11, 2025 · Updated 10 months ago
- Home server set up ☆13 · Oct 5, 2025 · Updated 5 months ago
- All-in-one environment to use Dria, the collective knowledge for AI. ☆14 · Mar 15, 2024 · Updated last year
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆49 · Oct 29, 2025 · Updated 4 months ago