DeutscheKI / llm-performance-tests
These are performance benchmarks we ran while preparing our own privacy-preserving, NDA-compliant in-house AI coding assistant. If, by any chance, you're a German KMU (small or medium-sized enterprise) and you want strong in-house AI too, feel free to contact us.
☆30 · Updated 10 months ago
Alternatives and similar repositories for llm-performance-tests
Users interested in llm-performance-tests are comparing it to the libraries listed below.
- Fast parallel LLM inference for MLX ☆246 · Updated last year
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆50 · Updated 8 months ago
- Distributed inference for MLX LLMs ☆100 · Updated last year
- ☆109 · Updated 5 months ago
- InferX: Inference as a Service Platform ☆156 · Updated this week
- Route LLM requests to the best model for the task at hand. ☆177 · Updated 3 weeks ago
- Sparse inferencing for transformer-based LLMs ☆217 · Updated 6 months ago
- DevQualityEval: An evaluation benchmark 📈 and framework to compare and evolve the quality of code generation of LLMs. ☆185 · Updated 8 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX. ☆100 · Updated 7 months ago
- REAP: Router-weighted Expert Activation Pruning for SMoE compression ☆232 · Updated 2 months ago
- Docs for GGUF quantization (unofficial) ☆366 · Updated 6 months ago
- ☆304 · Updated 3 months ago
- 1.58-bit LLM on Apple Silicon using MLX ☆243 · Updated last year
- Verify precision of all Kimi K2 API vendors ☆507 · Updated 2 weeks ago
- Enhancing LLMs with LoRA ☆206 · Updated 3 months ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆165 · Updated last year
- ☆134 · Updated 2 months ago
- API server for Transformer Lab ☆83 · Updated 2 months ago
- Tool to download models from the Hugging Face Hub and convert them to GGML/GGUF for llama.cpp ☆170 · Updated 9 months ago
- Benchmarking tool for vLLM inference performance with GPU monitoring ☆40 · Updated 2 months ago
- Tutorial for building an LLM router ☆244 · Updated last year
- Practical and advanced guide to LLMOps. It provides a solid understanding of large language models' general concepts, deployment techniqu… ☆79 · Updated last year
- Community-maintained hardware plugin for vLLM on Apple Silicon ☆400 · Updated last week
- Train large language models on MLX. ☆258 · Updated this week
- LLM inference in C/C++ ☆104 · Updated 2 weeks ago
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang ☆100 · Updated this week
- A command-line interface tool for serving LLMs using vLLM. ☆471 · Updated 2 weeks ago
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM. ☆401 · Updated 2 weeks ago
- NVIDIA Linux open GPU with P2P support ☆129 · Updated this week
- Very basic framework for composable, parameterized large language model (Q)LoRA / (Q)DoRA fine-tuning using mlx, mlx_lm, and OgbujiPT. ☆43 · Updated 7 months ago