coder543 / llm-speed-benchmark
A tool that can be used to measure the sequential performance of any OpenAI-compatible LLM API
☆22Updated last year
Alternatives and similar repositories for llm-speed-benchmark
Users that are interested in llm-speed-benchmark are comparing it to the libraries listed below
- ☆17Updated last year
- ☆134Updated 2 months ago
- The heart of The Pulsar App: fast, secure, and shared inference with a modern UI☆59Updated last year
- Dagger functions to import Hugging Face GGUF models into a local ollama instance and optionally push them to ollama.com.☆119Updated last year
- ☆51Updated last year
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform☆38Updated 2 years ago
- This is the Mixture-of-Agents (MoA) concept, adapted from the original work by TogetherAI. My version is tailored for local model usage a…☆118Updated last year
- an auto-sleeping and -waking framework around llama.cpp☆12Updated last year
- Serving LLMs in the HF-Transformers format via a PyFlask API☆72Updated last year
- ☆24Updated last year
- run ollama & gguf easily with a single command☆52Updated last year
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G…☆26Updated 10 months ago
- ☆23Updated 2 months ago
- Efficient computer use agent powered by Meta Llama 4 Maverick☆46Updated 9 months ago
- A simple experiment in letting two local LLMs have a conversation about anything!☆112Updated last year
- Super simple python connectors for llama.cpp, including vision models (Gemma 3, Qwen2-VL). Compile llama.cpp and run!☆29Updated 2 months ago
- entropix style sampling + GUI☆27Updated last year
- Open source implementation for computer use, using light OCR models and LLMs. Get Android app in link below.☆30Updated this week
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …☆56Updated last year
- Easily view and modify JSON datasets for large language models☆87Updated 8 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours☆65Updated last year
- ☆32Updated last year
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you?☆23Updated last year
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API☆47Updated last year
- ☆109Updated 5 months ago
- A quick and optimized solution to manage llama-based GGUF quantized models, download GGUF files, retrieve message formatting, add more mo…☆12Updated 2 years ago
- Embed anything.☆27Updated last year
- Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs).☆105Updated last year
- Forces DeepSeek R1 models to engage in extended reasoning by intercepting early termination tokens.☆19Updated last year
- LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT☆27Updated last year