KYLN24 / CritiQ
Repository of the paper "CritiQ: Mining Data Quality Criteria from Human Preferences". Code for CritiQ Flow & Training CritiQ Scorer.
☆18 Updated 2 months ago
Alternatives and similar repositories for CritiQ
Users who are interested in CritiQ are comparing it to the repositories listed below
- We systematically studied the factors that influence LLM-generated benchmarks. By using our code, you can generate high-quality QA datas… ☆19 Updated last month
- ☆50 Updated last month
- Reformatted Alignment ☆113 Updated 9 months ago
- ☆155 Updated 2 months ago
- Official completion of “Training on the Benchmark Is Not All You Need”. ☆34 Updated 6 months ago
- ☆88 Updated 8 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58 Updated last year
- Official Code for "Coser: Coordinating LLM-Based Persona Simulation of Established Roles" ☆106 Updated 2 weeks ago
- ☆19 Updated last year
- Repo for paper "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents" ☆53 Updated last year
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent ☆60 Updated 2 months ago
- Unleashing the Power of Cognitive Dynamics on Large Language Models ☆62 Updated 9 months ago
- The official github repo for MixEval-X, the first any-to-any, real-world benchmark. ☆14 Updated 5 months ago
- ☆83 Updated last year
- ☆50 Updated last year
- ☆36 Updated 10 months ago
- ☆106 Updated last year
- The official repository of the Omni-MATH benchmark. ☆85 Updated 6 months ago
- Code and Data for EMNLP 2024 Paper "Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent" ☆130 Updated 3 months ago
- a survey of long-context LLMs from four perspectives, architecture, infrastructure, training, and evaluation ☆55 Updated 3 months ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25] ☆88 Updated 3 months ago
- ☆166 Updated 3 weeks ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆85 Updated last year
- The official Github repository for paper "R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation" (EMNLP 2024 Fin … ☆33 Updated 7 months ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆121 Updated 3 months ago
- Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L… ☆50 Updated last year
- ☆95 Updated last year
- Code and Data for the paper "Evaluating Character Understanding of Large Language Models via Character Profiling from Fictional Works". ☆18 Updated 11 months ago
- Self-Knowledge Guided Retrieval Augmentation for Large Language Models (EMNLP Findings 2023) ☆26 Updated last year
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆63 Updated 8 months ago