devichand579 / HPT
Code for Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles
☆23 Updated 6 months ago
Alternatives and similar repositories for HPT
Users that are interested in HPT are comparing it to the libraries listed below
- Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆53 Updated 6 months ago
- ☆39 Updated last year
- ☆67 Updated 10 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆30 Updated last year
- ☆55 Updated last year
- ☆11 Updated last year
- Jina VDR is a multilingual, multi-domain benchmark for visual document retrieval ☆38 Updated 5 months ago
- AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents ☆36 Updated 3 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆61 Updated last year
- ☆54 Updated 2 weeks ago
- ☆61 Updated 7 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆92 Updated last year
- Open Implementations of LLM Analyses ☆107 Updated last year
- The Library for LLM-based multi-agent applications ☆103 Updated 6 months ago
- Moatless Testbeds allows you to create isolated testbed environments in a Kubernetes cluster where you can apply code changes through git… ☆14 Updated 9 months ago
- ☆31 Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆40 Updated last year
- Small, simple agent task environments for training and evaluation ☆19 Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆114 Updated 9 months ago
- Data preparation code for CrystalCoder 7B LLM ☆45 Updated last year
- ☆32 Updated last year
- A framework for high-fidelity retrieval augmented generation in industrial knowledge bases. Integrates jargon identification, context rec… ☆35 Updated 2 months ago
- ☆23 Updated last year
- Verifiers for LLM Reinforcement Learning ☆80 Updated 9 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆24 Updated last year
- LLM reads a paper and produces a working prototype ☆60 Updated 9 months ago
- This repository contains popular code generation frameworks such as MapCoder, CodeSIM. ☆69 Updated 7 months ago
- Everything for the Paper: 'Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing' ☆19 Updated 2 years ago
- ☆63 Updated last year
- A deep-dive into HyDE for Advanced LLM RAG + Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera… ☆34 Updated last year