DeutscheKI / llm-performance-tests
These are performance benchmarks we ran while preparing our own privacy-preserving, NDA-compliant in-house AI coding assistant. If you happen to be a German SME (KMU) and want strong in-house AI too, feel free to contact us.
☆25 · Updated 2 months ago
Alternatives and similar repositories for llm-performance-tests
Users interested in llm-performance-tests are comparing it to the libraries listed below.
- LLM inference in C/C++ ☆77 · Updated this week
- InferX is an Inference Function-as-a-Service platform ☆111 · Updated last week
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆48 · Updated last month
- Distributed inference for MLX LLMs ☆93 · Updated 10 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX ☆86 · Updated this week
- Tool to download models from the Hugging Face Hub and convert them to GGML/GGUF for llama.cpp ☆152 · Updated last month
- GPT-4-level conversational QA trained in a few hours ☆62 · Updated 10 months ago
- ☆101 · Updated 9 months ago
- ☆114 · Updated 6 months ago
- ☆130 · Updated 2 months ago
- API server for Transformer Lab ☆66 · Updated this week
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI… ☆49 · Updated 4 months ago
- Multi-Agent Step Race Benchmark: assessing LLM collaboration and deception under pressure. A multi-player "step-race" that challenges LLM… ☆54 · Updated 2 weeks ago
- ☆57 · Updated 4 months ago
- Hallucinations (confabulations) document-based benchmark for RAG, including human-verified questions and answers ☆176 · Updated 2 weeks ago
- Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agent benchmark that tests cooperative and self-interested strategies a… ☆36 · Updated 2 months ago
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs ☆76 · Updated 9 months ago
- ☆95 · Updated 6 months ago
- Conduct in-depth research with AI-driven insights: DeepDive is a command-line tool that leverages web searches and AI models to generate… ☆42 · Updated 9 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆140 · Updated 4 months ago
- Run multiple resource-heavy large models (LMs) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆67 · Updated last week
- 📡 Deploy AI models and apps to Kubernetes without developing a hernia ☆32 · Updated last year
- Testing LLM reasoning abilities with family-relationship quizzes ☆62 · Updated 4 months ago
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆107 · Updated last month
- Simple examples using Argilla tools to build AI ☆53 · Updated 7 months ago
- Scripts to create your own MoE models using MLX ☆90 · Updated last year
- Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a sm… ☆59 · Updated 2 weeks ago
- Serving LLMs in the HF Transformers format via a PyFlask API ☆71 · Updated 9 months ago
- Run Ollama & GGUF models easily with a single command ☆51 · Updated last year
- Generate train.jsonl and valid.jsonl files for fine-tuning Mistral and other LLMs ☆94 · Updated last year