Zhiyuan-Zeng / EvalTree
[COLM 2025] EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees
☆31 · Jul 11, 2025 · Updated 7 months ago
Alternatives and similar repositories for EvalTree
Users that are interested in EvalTree are comparing it to the libraries listed below
- WONDERBREAD benchmark + dataset for BPM tasks ☆34 · Jul 30, 2025 · Updated 6 months ago
- Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning [ICLR 2025] ☆50 · Jan 24, 2025 · Updated last year
- Implementation of the paper: "Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning… ☆22 · Nov 2, 2021 · Updated 4 years ago
- Code for paper "Search Methods for Sufficient, Socially-Aligned Feature Importance Explanations with In-Distribution Counterfactuals" ☆18 · Oct 17, 2022 · Updated 3 years ago
- Python package for serving a local search engine. One command to download and serve a datastore---that's it 😎. ☆25 · Jun 6, 2025 · Updated 8 months ago
- A simple lightweight Model Context Protocol (MCP) server integration framework ☆17 · Jan 23, 2026 · Updated 3 weeks ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆34 · Apr 20, 2025 · Updated 9 months ago
- Structured TRIZ prompt engineering for LLMs in an open, portable XML format – MIT licensed. ☆14 · Nov 11, 2025 · Updated 3 months ago
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following ☆136 · Jul 8, 2024 · Updated last year
- CoachLint is your AI coding coach. It guides you through errors instead of just solving them for you. ☆23 · Nov 20, 2025 · Updated 2 months ago
- MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces ☆10 · Mar 24, 2025 · Updated 10 months ago
- 🪝PISCES - Precise In-Parameter Suppression for Concept EraSure in Large Language Models ☆12 · May 30, 2025 · Updated 8 months ago
- VibEx (vx) is a developer-friendly CLI tool that streamlines the process of working with AI coding assistants. It helps developers prepar… ☆28 · May 17, 2025 · Updated 8 months ago
- Glitch Gremlin AI ☆15 · Apr 5, 2025 · Updated 10 months ago
- Authors' implementation of the paper Adaptive Information Seeking for Open-Domain Question Answering, published in EMNLP 2021. ☆38 · May 16, 2023 · Updated 2 years ago
- React Native, Right Now (rn-rn) ☆18 · Sep 2, 2025 · Updated 5 months ago
- A Discord bot to retrieve Shopify Orders and Statistics ☆10 · Dec 9, 2025 · Updated 2 months ago
- AutonomousSphere is an agentic collaboration server. Agents talk, act, and use tools like teammates. Federated servers form an internet o… ☆16 · May 13, 2025 · Updated 9 months ago
- Reference implementation of algorithms for reinforcement learning and Markov decision processes. ☆12 · Jan 28, 2021 · Updated 5 years ago
- SYSTEM PROMPT TRANSPARENCY FOR ALL ☆12 · May 22, 2025 · Updated 8 months ago
- A powerful AI prompt engineering tool that transforms simple instructions into detailed, context-rich prompts using Google's Gemini Pro t… ☆15 · Aug 28, 2025 · Updated 5 months ago
- Access to AI for free for anyone inside your Visual Studio. This is a Visual Studio extension. ☆19 · Dec 29, 2025 · Updated last month
- "Open-source toolkit (Python Library, Registry API, CLI) for secure, decentralized AI agent interoperability using A2A/MCP." ☆14 · May 10, 2025 · Updated 9 months ago
- Pascal2 Harvest project QuEst ☆14 · Sep 15, 2014 · Updated 11 years ago
- 💀 gigasmol: a lightweight wrapper for gigachat api model for seamless use with smolagents. ☆15 · Oct 23, 2025 · Updated 3 months ago
- AI Tasks. A LLM integrated agent orchestration tool for Liferay. ☆14 · May 16, 2025 · Updated 9 months ago
- Shakey OS Mobile AI Framework for React Native allowing people to build React Native apps for IOS and Android with AI tooling and wallet … ☆28 · Feb 3, 2025 · Updated last year
- IBM watsonx Code Assistant for Red Hat Ansible Lightspeed demystifies the process of Ansible Playbook creation through generative AI-powe… ☆19 · Sep 18, 2025 · Updated 4 months ago
- Emphasizes AI-based projects for various companies. ☆15 · Apr 1, 2025 · Updated 10 months ago
- Rationales for Sequential Predictions ☆40 · Mar 10, 2022 · Updated 3 years ago
- Code for "Tracing Knowledge in Language Models Back to the Training Data" ☆39 · Dec 27, 2022 · Updated 3 years ago
- Code for Massive-scale Decoding for Text Generation using Lattices ☆44 · Jul 29, 2022 · Updated 3 years ago