varungodbole / llm-evals
☆18Updated 5 months ago
Alternatives and similar repositories for llm-evals
Users who are interested in llm-evals are comparing it to the libraries listed below.
- Implementation of the board game Codenames, re-imagined as a collaborative game between LLM agents☆106Updated 8 months ago
- Easiest way to give context to LLMs; Attachments has the ambition to be the general funnel for any files to be transformed into images+te…☆318Updated last month
- ShellSage saves sysadmins’ sanity by solving shell script snafus super swiftly☆373Updated this week
- ☆142Updated 8 months ago
- ☆234Updated 7 months ago
- ☆92Updated last year
- ☆106Updated 9 months ago
- Pixelagent — Multimodal stateful agents☆220Updated 4 months ago
- This repo tracks the opened and merged PRs by the top SWE coding agents by OpenAI, GitHub, and others. Updates every 3 hours.☆292Updated this week
- ☆172Updated last week
- ☆89Updated last year
- The State Of The Art, intelligence☆154Updated 2 months ago
- Metadspy: The framework for specifying—not programming—language models☆88Updated 4 months ago
- Lightweight Nearest Neighbors with Flexible Backends☆312Updated last month
- A framework for collecting a large human-sourced chain-of-thoughts dataset☆23Updated last year
- A framework for optimizing DSPy programs with RL☆214Updated this week
- Claudette is Claude's friend☆286Updated 2 weeks ago
- lossily compress representation vectors using product quantization☆59Updated last week
- An automated tool for discovering insights from research paper corpora☆138Updated last year
- ☆81Updated 2 months ago
- Claude Deep Research config for Claude Code.☆223Updated 7 months ago
- Parallel Reasoning: llm-consortium orchestrates multiple LLMs, iteratively refines & achieves consensus.☆368Updated last week
- AI eXplainable Inference & Search. Open Sourcing on-premise, ultra-fast latency intelligence to all.☆35Updated 8 months ago
- Minimal agent runtime built with DSPy modules and a thin Python loop. Includes CLI, FastAPI server, and eval harness with OpenAI/Ollama s…☆63Updated last month
- Minimal example of MCP for parsing llms.txt☆40Updated 6 months ago
- ☆211Updated this week
- run paligemma in real time☆133Updated last year
- Provider-agnostic, open-source evaluation infrastructure for language models☆641Updated this week
- Using the moondream VLM with optical flow for promptable object tracking☆71Updated 8 months ago
- Towards Human-Friendly, Fast Learning and Adaptable Agent Communities☆153Updated 3 months ago