devichand579 / HPT
code for Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles
☆23 Updated 3 weeks ago
Alternatives and similar repositories for HPT
Users that are interested in HPT are comparing it to the libraries listed below
- ☆11 Updated 9 months ago
- ☆40 Updated 8 months ago
- ☆48 Updated 10 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆29 Updated 8 months ago
- ☆66 Updated 4 months ago
- ☆54 Updated 9 months ago
- Data preparation code for CrystalCoder 7B LLM ☆45 Updated last year
- The original Shared Recurrent Memory Transformer implementation ☆30 Updated last month
- OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System. ☆19 Updated 10 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆92 Updated 7 months ago
- The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆92 Updated last month
- LLM reads a paper and produces a working prototype ☆57 Updated 4 months ago
- Verifiers for LLM Reinforcement Learning ☆70 Updated 4 months ago
- Universal text classifier for generative models ☆24 Updated last year
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024] ☆85 Updated 7 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 Updated 11 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆101 Updated 2 months ago
- ☆23 Updated 6 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 8 months ago
- ☆59 Updated 8 months ago
- ☆31 Updated last year
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆97 Updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆26 Updated 8 months ago
- Open Implementations of LLM Analyses ☆106 Updated 10 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆36 Updated last year
- ☆20 Updated 4 months ago
- Small, simple agent task environments for training and evaluation ☆18 Updated 9 months ago
- ☆24 Updated 11 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆39 Updated last week
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆43 Updated last year