PKU-Baichuan-MLSystemLab/CFBench

CFBench: A Comprehensive Constraints-Following Benchmark for LLMs

☆55

Alternatives and similar repositories for CFBench

Users who are interested in CFBench are comparing it to the repositories listed below.

thu-coai / ComplexBench
Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
☆102 · Feb 20, 2025 · Updated last year
YJiangcm / FollowBench
[ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models
☆118 · Jun 12, 2025 · Updated last year
Rainier-rq / verl-if
Official implementation of the paper "Instructions are all you need: Self-supervised Reinforcement Learning for Instruction Following"
☆40 · Jan 11, 2026 · Updated 6 months ago
yizhilll / CIF-Bench
☆18 · Feb 29, 2024 · Updated 2 years ago
qinyiwei / InfoBench
☆61 · Aug 22, 2024 · Updated last year
yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆32 · Oct 9, 2025 · Updated 9 months ago
meowpass / FollowComplexInstruction
Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…
☆55 · Jun 24, 2024 · Updated 2 years ago
kkk-an / UltraIF
Code of EMNLP 2025 paper 'UltraIF: Advancing Instruction Following from the Wild'.
☆21 · Apr 3, 2025 · Updated last year
Junjie-Ye / MulDimIF
[ACL 2026] A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models
☆23 · Jul 10, 2026 · Updated 2 weeks ago
Abbey4799 / CELLO
Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?" (AAAI2024)
☆51 · Apr 19, 2024 · Updated 2 years ago
CLUEbenchmark / Math24o
Math24o: a high school Olympiad mathematics competition evaluation set (High School Olympiad Mathematics Chinese Benchmark)
☆14 · Mar 27, 2025 · Updated last year
UKPLab / acl2022-impli
☆13 · Mar 15, 2022 · Updated 4 years ago
CriticBench / CriticBench
[ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
☆31 · Mar 5, 2024 · Updated 2 years ago
PKU-Baichuan-MLSystemLab / PAS
☆53 · Sep 11, 2024 · Updated last year
KodCode-AI / code-r1
Reproducing R1 for Code with Reliable Rewards
☆13 · Apr 9, 2025 · Updated last year
alycialee / beyond-scale-language-data-diversity
☆13 · Updated this week
mdcnn / CVPR-CLIC-Challenge
☆10 · Feb 21, 2020 · Updated 6 years ago
wzhouad / WPO
Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"
☆41 · Sep 24, 2024 · Updated last year
THU-KEG / Crab
[CIKM 2025] Constraint Back-translation Improves Complex Instruction Following of Large Language Models
☆18 · May 23, 2025 · Updated last year
BaichuanSEED / BaichuanSEED.github.io
Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet…
☆18 · Aug 28, 2024 · Updated last year
josejg / instruction_following_eval
Instruction Following Eval
☆18 · Jan 16, 2025 · Updated last year
baichuan-inc / Baichuan-Omni-1.5
☆193 · Feb 8, 2025 · Updated last year
rookie-joe / AutoPSV
☆50 · Oct 28, 2024 · Updated last year
QwenLM / AutoIF
☆336 · Jul 25, 2024 · Updated last year
princeton-nlp / Collie
[ICLR 2024] COLLIE: Systematic Construction of Constrained Text Generation Tasks
☆63 · Aug 2, 2023 · Updated 2 years ago
ZhangXJ199 / EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
☆22 · Aug 28, 2025 · Updated 10 months ago
BytedTsinghua-SIA / Enigmata
Resources for the Enigmata Project.
☆82 · Aug 13, 2025 · Updated 11 months ago
pldlgb / nuggets
☆89 · Dec 29, 2023 · Updated 2 years ago
zhaochen0110 / conflictbank
Code and data for "ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM" (NeurIPS 2024 Track Datasets and…
☆71 · May 16, 2025 · Updated last year
thu-coai / PICL
Code for ACL2023 paper: Pre-Training to Learn in Context
☆106 · Jul 26, 2024 · Updated last year
mtbench101 / mt-bench-101
[ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
☆152 · Jul 24, 2024 · Updated 2 years ago
MrBlankness / TPO
Pytorch implementation of Tree Preference Optimization (TPO) (Accepted by ICLR'25)
☆28 · Apr 24, 2025 · Updated last year
EsYoon7 / RLHF-TLCR
[ACL'24 Findings] Official code for "TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback"
☆12 · Dec 6, 2024 · Updated last year
SalesforceAIResearch / FoFo
☆27 · Jun 2, 2026 · Updated last month
jdh-algo / JoyDataForge
A data synthesis tool that simply and efficiently synthesizes large-model training data for different business scenarios
☆46 · Jan 2, 2025 · Updated last year
a-m-team / a-m-models
a-m-team's exploration in large language modeling
☆196 · May 29, 2025 · Updated last year
xiaoyuisrain / metaphor-understanding-challenge
☆24 · Mar 8, 2024 · Updated 2 years ago
GAIR-NLP / MetaCritique
Evaluate the Quality of Critique
☆37 · Jun 1, 2024 · Updated 2 years ago
piekey1994 / IOM
Information-oriented Metric (IOM)
☆11 · Sep 2, 2020 · Updated 5 years ago