cpldcpu / llmbenchmark

Various LLM Benchmarks

☆24

Alternatives and similar repositories for llmbenchmark

Users who are interested in llmbenchmark are comparing it to the repositories listed below.

AaronFeng753 / Better-Qwen3
Auto Thinking Mode switch for Qwen3 in Open webui
☆66 Updated 2 months ago
oumi-ai / halloumi-demo
Try out HallOumi, a state-of-the-art claim verification model in a simple UI!
☆37 Updated 3 months ago
woct0rdho / transformers-qwen3-moe-fused
Fused Qwen3 MoE layer for faster training, compatible with HF Transformers, LoRA, 4-bit quant, Unsloth
☆122 Updated this week
lechmazur / pgg_bench
Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agent benchmark that tests cooperative and self-interested strategies a…
☆37 Updated 3 months ago
and270 / thinking_effort_processor
☆90 Updated last week
EQ-bench / eqbench3
☆14 Updated 2 months ago
unit-mesh / edge-infer
EdgeInfer enables efficient edge intelligence by running small AI models, including embeddings and OnnxModels, on resource-constrained de…
☆45 Updated last year
astramind-ai / Pulsar
The heart of The Pulsar App: fast, secure, and shared inference with a modern UI
☆57 Updated 7 months ago
unslothai / llama.cpp
LLM inference in C/C++
☆94 Updated this week
Artefact2 / llm-eval
A super simple web interface to perform blind tests on LLM outputs.
☆28 Updated last year
BlinkDL / fast.c
Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code.
☆72 Updated 5 months ago
uukuguy / speechless
LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities.
☆104 Updated this week
gvlassis / gvtop
🎮 Material You TUI for monitoring NVIDIA GPUs
☆50 Updated last month
BeautyyuYanli / tooluser
Enable tool-use ability for any LLM model (DeepSeek V3/R1, etc.)
☆52 Updated last month
sambanova / agents
☆50 Updated this week
Jellyfish042 / Sudoku-RWKV
☆142 Updated 7 months ago
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58-bit language models
☆80 Updated last month
TechxGenus / CursorCore
CursorCore: Assist Programming through Aligning Anything
☆127 Updated 5 months ago
femto / minion
👷‍♂️Minion is Agent's Brain. Minion is designed to execute any type of queries, offering a variety of features that demonstrate its flex…
☆23 Updated this week
iamlemec / bert.cpp
GGML implementation of BERT model with Python bindings and quantization.
☆56 Updated last year
nyunAI / PruneGPT
☆52 Updated last year
willkurt / token-explorer
A simple tool that lets you explore different possible paths that an LLM might sample.
☆175 Updated 2 months ago
lechmazur / nyt-connections
Benchmark that evaluates LLMs using 651 NYT Connections puzzles extended with extra trick words
☆130 Updated this week
karminski / streaming-json-py
A streamlined, user-friendly JSON streaming preprocessor, crafted in Python.
☆102 Updated 9 months ago
Codys12 / airllm
AirLLM: 70B inference with a single 4GB GPU
☆14 Updated 3 weeks ago
pnmartinez / simple-computer-use
Open source implementation for computer use, using light OCR models and LLMs. Get Android app in link below.
☆26 Updated 2 weeks ago
codelion / pts
Pivotal Token Search
☆109 Updated last week
severian42 / Computational-Model-for-Symbolic-Representations
Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …
☆49 Updated 5 months ago
lechmazur / generalization
Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a sm…
☆60 Updated this week
firstbatchxyz / function-calling-eval
The DPAB-α Benchmark
☆28 Updated 6 months ago