devichand579 / HPT
Code for Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles
☆24, updated 3 months ago
Alternatives and similar repositories for HPT
Users that are interested in HPT are comparing it to the libraries listed below
- Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆54, updated 3 months ago
- OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System. ☆18, updated last year
- ☆40, updated 10 months ago
- ☆11, updated 11 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91, updated 9 months ago
- ☆67, updated 7 months ago
- ☆50, updated last year
- ☆55, updated 11 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆37, updated last year
- ☆58, updated 4 months ago
- Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems" ☆57, updated 8 months ago
- ☆23, updated 3 weeks ago
- Nexusflow function call, tool use, and agent benchmarks. ☆29, updated 10 months ago
- Moatless Testbeds allows you to create isolated testbed environments in a Kubernetes cluster where you can apply code changes through git… ☆14, updated 6 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22, updated 11 months ago
- Open Implementations of LLM Analyses ☆107, updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆27, updated 10 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆111, updated 6 months ago
- LLM reads a paper and produces a working prototype ☆57, updated 6 months ago
- ☆61, updated 10 months ago
- A framework for pitting LLMs against each other in an evolving library of games ☆34, updated 6 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆79, updated last year
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆88, updated this week
- OLMost every training recipe you need to perform data interventions with the OLMo family of models. ☆52, updated this week
- ☆79, updated last year
- This repository contains popular code generation frameworks such as MapCoder, CodeSIM. ☆64, updated 4 months ago
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you… ☆37, updated last year
- Example implementation of Iteration of Thought - Give a star if you like the project ☆41, updated 10 months ago
- ☆25, updated 5 months ago
- ☆30, updated last year