devichand579 / HPT
Code for "Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles"
☆23 · Updated 5 months ago
Alternatives and similar repositories for HPT
Users who are interested in HPT are comparing it to the repositories listed below.
- Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆53 · Updated 6 months ago
- ☆39 · Updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆27 · Updated last year
- ☆54 · Updated last year
- Jina VDR is a multilingual, multi-domain benchmark for visual document retrieval. ☆37 · Updated 5 months ago
- ☆28 · Updated last month
- ☆55 · Updated last year
- ☆11 · Updated last year
- LLM reads a paper and produces a working prototype. ☆60 · Updated 9 months ago
- ☆67 · Updated 9 months ago
- ☆23 · Updated last year
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025). ☆91 · Updated 11 months ago
- ☆55 · Updated 4 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆30 · Updated last year
- Voyage AI Official Python Library. ☆90 · Updated 3 weeks ago
- ☆63 · Updated 6 months ago
- AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents. ☆35 · Updated 3 months ago
- OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System. ☆18 · Updated last year
- ☆63 · Updated last year
- Official repo for the paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems". ☆59 · Updated 10 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆38 · Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆80 · Updated last year
- A framework for high-fidelity retrieval augmented generation in industrial knowledge bases. Integrates jargon identification, context rec… ☆35 · Updated 2 months ago
- Solve Geometric & Graph Problems with Large Language Models. ☆32 · Updated 2 years ago
- Measuring RAG solutions throughput and latency. ☆18 · Updated last year
- The Library for LLM-based multi-agent applications. ☆100 · Updated 5 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆89 · Updated 3 weeks ago
- Multi-Granularity LLM Debugger [ICSE2026]. ☆94 · Updated 6 months ago
- This repository contains popular code generation frameworks such as MapCoder and CodeSIM. ☆68 · Updated 6 months ago
- ☆25 · Updated 7 months ago