cpldcpu / llmbenchmark
Various LLM Benchmarks
☆19Updated 2 weeks ago
Alternatives and similar repositories for llmbenchmark:
Users who are interested in llmbenchmark are comparing it to the repositories listed below:
- Try out HallOumi, a state-of-the-art claim verification model in a simple UI!☆30Updated 3 weeks ago
- Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agent benchmark that tests cooperative and self-interested strategies a…☆35Updated 2 weeks ago
- EdgeInfer enables efficient edge intelligence by running small AI models, including embeddings and OnnxModels, on resource-constrained de…☆44Updated last year
- The heart of The Pulsar App: fast, secure and shared inference with a modern UI☆56Updated 4 months ago
- Self-hosted LLM chatbot arena, with yourself as the only judge☆39Updated last year
- Query Expansion for Better Query Embedding using LLMs☆47Updated 2 months ago
- ☆85Updated last month
- GGML implementation of BERT model with Python bindings and quantization.☆56Updated last year
- Trying to deconstruct RWKV in understandable terms☆14Updated last year
- A super simple web interface to perform blind tests on LLM outputs.☆28Updated last year
- ☆24Updated 3 months ago
- Yet Another (LLM) Web UI, made with Gemini☆11Updated 4 months ago
- The DPAB-α Benchmark☆20Updated 3 months ago
- MCP server for connecting agentic systems to search systems via searXNG☆62Updated 2 months ago
- ☆44Updated last week
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe…☆56Updated 2 months ago
- ☆53Updated 10 months ago
- ☆57Updated 2 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX.☆77Updated 4 months ago
- Convert downloaded Ollama models back into their GGUF equivalent format☆30Updated 4 months ago
- ☆130Updated 5 months ago
- Workflow Defined Engine☆23Updated last week
- Forces DeepSeek R1 models to engage in extended reasoning by intercepting early termination tokens.☆19Updated 2 months ago
- ☆22Updated 2 months ago
- LLM inference in C/C++☆71Updated this week
- Very minimal (and stateless) agent framework☆42Updated 3 months ago
- Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a sm…☆44Updated last week
- OpenPipe Reinforcement Learning Experiments☆22Updated last month
- ☆12Updated 7 months ago
- Ollamadore 64 is a private ultra lightweight frontend for Ollama that weighs well under 64 kilobytes on disk.☆42Updated last month