☆136 · May 2, 2025 · Updated last year
Alternatives and similar repositories for SOLOBench
Users that are interested in SOLOBench are comparing it to the libraries listed below.
- Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a sm… ☆71 · Apr 16, 2026 · Updated last month
- ☆27 · Jun 11, 2025 · Updated last year
- ☆23 · Sep 27, 2024 · Updated last year
- Tools for formatting large language model prompts. ☆13 · Dec 19, 2023 · Updated 2 years ago
- A benchmark for emotional intelligence in large language models ☆430 · Jul 26, 2024 · Updated last year
- Chat WebUI is an easy-to-use user interface for interacting with AI, and it comes with multiple useful built-in tools such as web search … ☆52 · Feb 10, 2026 · Updated 4 months ago
- ☆19 · Jul 4, 2025 · Updated 11 months ago
- Real-time webcam demo with SmolVLM (mlx-community/SmolVLM-Instruct-4bit) and MLX-VLM ☆28 · Jun 12, 2025 · Updated last year
- Information Processing Evaluation for Large Language Models ☆55 · Apr 24, 2026 · Updated last month
- SLOP Detector and analyzer based on dictionary for shareGPT JSON and text ☆100 · Apr 2, 2026 · Updated 2 months ago
- Cognito: Supercharge your Chrome browser with AI. Guide, query, and control everything using natural language. ☆57 · Jan 11, 2026 · Updated 5 months ago
- ☆16 · Feb 21, 2026 · Updated 3 months ago
- V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory! ☆36 · Nov 20, 2025 · Updated 6 months ago
- This is the Mixture-of-Agents (MoA) concept, adapted from the original work by TogetherAI. My version is tailored for local model usage a… ☆12 · Jun 25, 2024 · Updated last year
- Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLM… ☆87 · Dec 9, 2025 · Updated 6 months ago
- ☆57 · Feb 18, 2025 · Updated last year
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers. ☆246 · Aug 7, 2025 · Updated 10 months ago
- Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆27 · Jan 26, 2024 · Updated 2 years ago
- ☆336 · Nov 1, 2025 · Updated 7 months ago
- Approximating the joint distribution of language models via MCTS ☆22 · Nov 3, 2024 · Updated last year
- SVGBench: A challenging LLM benchmark that tests knowledge, coding, physical reasoning capabilities of LLMs. ☆69 · Feb 12, 2026 · Updated 4 months ago
- Try out HallOumi, a state-of-the-art claim verification model in a simple UI! ☆41 · Apr 2, 2025 · Updated last year
- Tiny Llama model trained to play chess ☆30 · Jul 22, 2025 · Updated 10 months ago
- A fairly lightweight daemon that keeps your computer awake. Designed for rootless environments. ☆26 · May 3, 2019 · Updated 7 years ago
- ☆81 · Jun 20, 2025 · Updated 11 months ago
- ☆17 · Aug 5, 2025 · Updated 10 months ago
- extension for text WebUI ☆20 · Aug 7, 2025 · Updated 10 months ago
- A toy Inspect implementation of the Bliss Attractor eval from Claude 4 System Card Welfare Assessment ☆38 · Jun 5, 2025 · Updated last year
- FamilyBench evaluation tool for testing the relational reasoning capabilities of Large Language Models (LLMs). ☆47 · May 4, 2026 · Updated last month
- This benchmark tests how well LLMs incorporate a set of 10 mandatory story elements (characters, objects, core concepts, attributes, moti… ☆388 · Updated this week
- Verifiers for LLM Reinforcement Learning ☆83 · Sep 11, 2025 · Updated 9 months ago
- ☆75 · Mar 10, 2026 · Updated 3 months ago
- ☆19 · Oct 1, 2025 · Updated 8 months ago
- [ICML 2026] Official code release for paper "Temporal Score Rescaling for Temperature Sampling in Diffusion and Flow Models" ☆46 · May 26, 2026 · Updated 3 weeks ago
- ☆347 · Mar 5, 2026 · Updated 3 months ago
- Limopola is an AI platform that allows you to communicate with a wide range of AI models. It features autonomous agents, model-agnostic r… ☆111 · Dec 13, 2025 · Updated 6 months ago
- A ComfyUI extension for OmniGen2 ☆49 · Jul 1, 2025 · Updated 11 months ago
- Easy to use interface for the Whisper model optimized for all GPUs! ☆542 · Feb 15, 2026 · Updated 4 months ago
- Interactive levels adjustment node for ComfyUI that provides a real-time levels adjustment tool directly within the user interface. It al… ☆42 · Aug 23, 2025 · Updated 9 months ago