devichand579 / HPT
Code for Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles
☆23 Updated this week
Alternatives and similar repositories for HPT
Users who are interested in HPT are comparing it to the libraries listed below.
- ☆40 Updated 7 months ago
- Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆52 Updated 2 weeks ago
- OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System. ☆19 Updated 9 months ago
- ☆66 Updated 3 months ago
- RAGLight is a lightweight and modular Python library for implementing Retrieval-Augmented Generation (RAG), Agentic RAG and RAT (Retrieva… ☆30 Updated 4 months ago
- ☆47 Updated 9 months ago
- Data preparation code for CrystalCoder 7B LLM ☆45 Updated last year
- Nexusflow function call, tool use, and agent benchmarks. ☆27 Updated 7 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 7 months ago
- ☆24 Updated 10 months ago
- The Library for LLM-based multi-agent applications ☆87 Updated this week
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆26 Updated 7 months ago
- LLM reads a paper and produces a working prototype ☆58 Updated 3 months ago
- ☆18 Updated 2 weeks ago
- ☆57 Updated 7 months ago
- Code for paper called Self-Training Elicits Concise Reasoning in Large Language Models ☆39 Updated 3 months ago
- ☆53 Updated 8 months ago
- ☆54 Updated 3 weeks ago
- ☆96 Updated 10 months ago
- Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems" ☆55 Updated 5 months ago
- Verifiers for LLM Reinforcement Learning ☆65 Updated 3 months ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆68 Updated last year
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 6 months ago
- ☆19 Updated 3 weeks ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆35 Updated last year
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆75 Updated this week
- AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents ☆25 Updated last month
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆71 Updated 4 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆97 Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 Updated 10 months ago