facebookresearch/Multi-IF

The evaluation code for MultiIF multi-turn and multi-lingual instruction following

☆63

Alternatives and similar repositories for Multi-IF

Users that are interested in Multi-IF are comparing it to the libraries listed below.

meowpass / FollowComplexInstruction
Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…
☆55 · Jun 24, 2024 · Updated 2 years ago
facebookresearch / AdvancedIF
This is the github to open source benchmark AdvancedIF, see LAMA L1387358RCRO
☆36 · Nov 26, 2025 · Updated 7 months ago
Cohere-Labs-Community / m-rewardbench
Evaluating Reward Models in Multilingual Settings (ACL Main '25)
☆42 · May 16, 2025 · Updated last year
yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆32 · Oct 9, 2025 · Updated 9 months ago
OSU-NLP-Group / Pangu
☆12 · Jul 10, 2023 · Updated 3 years ago
SalesforceAIResearch / FoFo
☆27 · Jun 2, 2026 · Updated last month
heyunh2015 / PARADE_dataset
code and dataset of EMNLP 2020 paper "PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge"
☆12 · Nov 6, 2020 · Updated 5 years ago
cdhx / QDTQA
Code for AAAI 2023 research track paper "Question Decomposition Tree for Answering Complex Questions over Knowledge Bases"
☆17 · Jan 3, 2024 · Updated 2 years ago
THU-KEG / Crab
[CIKM 2025] Constraint Back-translation Improves Complex Instruction Following of Large Language Models
☆18 · May 23, 2025 · Updated last year
YJiangcm / FollowBench
[ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models
☆118 · Jun 12, 2025 · Updated last year
Abbey4799 / CELLO
Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?" (AAAI2024)
☆51 · Apr 19, 2024 · Updated 2 years ago
leileqiTHU / Attacker
The repo for using the model https://huggingface.co/thu-coai/Attacker-v0.1
☆13 · Apr 23, 2025 · Updated last year
Junjie-Ye / MulDimIF
[ACL 2026] A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models
☆23 · Jul 10, 2026 · Updated last week
QwenLM / AutoIF
☆335 · Jul 25, 2024 · Updated last year
icip-cas / AutoAlign
A toolkit for automated alignment research.
☆15 · Jul 3, 2026 · Updated 2 weeks ago
PKU-Baichuan-MLSystemLab / CFBench
CFBench: A Comprehensive Constraints-Following Benchmark for LLMs
☆55 · Aug 26, 2024 · Updated last year
scaleapi / SWE-Interact
New testbed of interactive SWE tasks for coding agents, set in a realistic multi-turn developer driven environment
☆23 · Jun 30, 2026 · Updated 3 weeks ago
ysy-phoenix / evalhub
All-in-one benchmarking platform for evaluating LLM.
☆15 · Nov 12, 2025 · Updated 8 months ago
THU-KEG / VerIF
[EMNLP 2025] Verification Engineering for RL in Instruction Following
☆57 · Mar 30, 2026 · Updated 3 months ago
pldlgb / nuggets
☆89 · Dec 29, 2023 · Updated 2 years ago
QwenLM / CodeElo
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
☆74 · Feb 3, 2025 · Updated last year
heyunh2015 / AttList
data and code of AttList from CIKM2019
☆22 · Feb 1, 2020 · Updated 6 years ago
potsawee / mqag0
MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency
☆31 · Sep 11, 2023 · Updated 2 years ago
Rainier-rq / verl-if
Official implementation of the paper "Instructions are all you need: Self-supervised Reinforcement Learning for Instruction Following"
☆40 · Jan 11, 2026 · Updated 6 months ago
open-compass / CriticEval
[NeurIPS 2024] A comprehensive benchmark for evaluating critique ability of LLMs
☆49 · Nov 29, 2024 · Updated last year
michelecafagna26 / cider
Pythonic wrappers for Cider/CiderD evaluation metrics. Provides CIDEr as well as CIDEr-D (CIDEr Defended) which is more robust to gaming …
☆13 · Dec 4, 2025 · Updated 7 months ago
mtbench101 / mt-bench-101
[ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
☆152 · Jul 24, 2024 · Updated last year
GAIR-NLP / MetaCritique
Evaluate the Quality of Critique
☆37 · Jun 1, 2024 · Updated 2 years ago
Tongyi-CCAI / Complex-IF
☆34 · Jan 26, 2026 · Updated 5 months ago
zhaochen0110 / Cotempqa
Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)
☆31 · Jul 3, 2024 · Updated 2 years ago
PolarisRisingWar / Note-of-PyTorch-60-Minutes-Tutorial
Files for a 60-minute PyTorch crash course (Deep Learning with PyTorch: A 60 Minute Blitz)
☆30 · Dec 8, 2021 · Updated 4 years ago
huggingface / ioi
☆42 · Mar 26, 2025 · Updated last year
tianyi-lab / Cherry_LLM
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…
☆416 · Jun 25, 2025 · Updated last year
hkust-nlp / deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
☆599 · Dec 9, 2024 · Updated last year
AadityaRavindran / gym-cartpolemod
Modified CartPole-v0 OpenAI Gym environment with various noisy cases and Reinforcement Learning based controller
☆10 · Dec 5, 2017 · Updated 8 years ago
F2-Song / Weak-to-Strong-Decoding
The official implementation of "Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding"
☆22 · Jun 26, 2025 · Updated last year
Dahoas / QDSyntheticData
☆14 · Aug 15, 2024 · Updated last year
causalNLP / amr_llm
This repo explores how AMR to address tasks difficult for LLMs
☆13 · Jan 15, 2024 · Updated 2 years ago
ZhaolinGao / REFUEL
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
☆25 · Oct 8, 2024 · Updated last year