KYLN24 / CritiQ
Repository of the paper "CritiQ: Mining Data Quality Criteria from Human Preferences". Code for CritiQ Flow & Training CritiQ Scorer.
☆18Updated 3 months ago
Alternatives and similar repositories for CritiQ
Users who are interested in CritiQ are comparing it to the repositories listed below.
- ☆36Updated 11 months ago
- Reformatted Alignment☆113Updated 11 months ago
- The official repository of the Omni-MATH benchmark.☆87Updated 8 months ago
- ☆50Updated last year
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent☆64Updated 3 months ago
- The implementation of paper "LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Fee…☆38Updated last year
- [ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation"☆38Updated last year
- The GitHub repository for the paper "Self-prompted Chain-of-Thought on Large Language Models for Open-domain Multi-hop Reasoning" accepte…☆19Updated last year
- ☆162Updated 4 months ago
- ☆110Updated last year
- ☆50Updated 2 months ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations☆128Updated 4 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆127Updated 2 months ago
- Official Code for "Coser: Coordinating LLM-Based Persona Simulation of Established Roles"☆126Updated 2 months ago
- This repo aims to record resource of role-playing abilities in LLMs, including dataset, paper, application, etc.☆128Updated 11 months ago
- This is the code repo for the paper "Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning".☆25Updated 2 weeks ago
- ☆57Updated 10 months ago
- [EMNLP 2025] CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward☆44Updated 3 weeks ago
- We systematically studied the influencing factors when LLM generates benchmarks. By using our code, you can generate high-quality QA datas…☆19Updated 3 months ago
- Code and Data for the paper "Evaluating Character Understanding of Large Language Models via Character Profiling from Fictional Works".☆20Updated last year
- Official code for the paper: InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews (previo…☆85Updated 3 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments"☆58Updated last year
- ☆20Updated last year
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.☆64Updated 10 months ago
- ☆16Updated 2 years ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.☆87Updated last year
- Repo for paper "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents"☆56Updated last year
- a survey of long-context LLMs from four perspectives, architecture, infrastructure, training, and evaluation☆56Updated 5 months ago
- ☆19Updated 8 months ago
- [ICML'2024] Can AI Assistants Know What They Don't Know?☆83Updated last year