YJiangcm/FollowBench

[ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models

☆118

Alternatives and similar repositories for FollowBench

Users who are interested in FollowBench are comparing it to the repositories listed below.

YJiangcm / BMC
[ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
☆12 · Jan 26, 2025 · Updated last year
thu-coai / ComplexBench
Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
☆102 · Feb 20, 2025 · Updated last year
YJiangcm / improved_DeepLOB
This repo contains my reimplementation and improvement of DeepLOB model.
☆32 · Apr 22, 2021 · Updated 5 years ago
qinyiwei / InfoBench
☆61 · Aug 22, 2024 · Updated last year
meowpass / FollowComplexInstruction
Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…
☆55 · Jun 24, 2024 · Updated 2 years ago
YJiangcm / LTE
[ACL 2024] Learning to Edit: Aligning LLMs with Knowledge Editing
☆37 · Aug 19, 2024 · Updated last year
PKU-Baichuan-MLSystemLab / CFBench
CFBench: A Comprehensive Constraints-Following Benchmark for LLMs
☆55 · Aug 26, 2024 · Updated last year
ConiferLM / Conifer
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models
☆91 · Apr 4, 2024 · Updated 2 years ago
Abbey4799 / CELLO
Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?"(AAAI2024)
☆51 · Apr 19, 2024 · Updated 2 years ago
QwenLM / AutoIF
☆335 · Jul 25, 2024 · Updated last year
yizhilll / CIF-Bench
☆18 · Feb 29, 2024 · Updated 2 years ago
PKU-Baichuan-MLSystemLab / SysBench
SysBench: Can Large Language Models Follow System Messages?
☆40 · Sep 4, 2024 · Updated last year
princeton-nlp / Collie
[ICLR 2024] COLLIE: Systematic Construction of Constrained Text Generation Tasks
☆63 · Aug 2, 2023 · Updated 2 years ago
kkk-an / UltraIF
Code of EMNLP 2025 paper 'UltraIF: Advancing Instruction Following from the Wild'.
☆21 · Apr 3, 2025 · Updated last year
OFA-Sys / InsTag
InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning
☆287 · Aug 20, 2023 · Updated 2 years ago
CriticBench / CriticBench
[ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
☆31 · Mar 5, 2024 · Updated 2 years ago
YJiangcm / Movielens1M-movie-recommendation-system
Deep-learning movie recommendation systems based on Auto Encoder (AE), Variational Auto Encoder (VAE), and BERT, implemented using the MovieLens dataset
☆77 · Dec 18, 2020 · Updated 5 years ago
hkust-nlp / deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
☆599 · Dec 9, 2024 · Updated last year
YJiangcm / Lion
[EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models
☆210 · Feb 11, 2024 · Updated 2 years ago
QwenLM / online_merging_optimizers
Implementations of online merging optimizers proposed by Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
☆82 · Jun 19, 2024 · Updated 2 years ago
YJiangcm / PromCSE
[EMNLP 2022] Improved Universal Sentence Embeddings with Prompt-based Contrastive Learning and Energy-based Learning
☆136 · Nov 17, 2023 · Updated 2 years ago
THUDM / AlignBench
A multi-dimensional Chinese alignment evaluation benchmark for large language models (ACL 2024)
☆430 · Oct 25, 2025 · Updated 8 months ago
mtbench101 / mt-bench-101
[ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
☆152 · Jul 24, 2024 · Updated last year
QwenLM / Self-Lengthen
☆98 · Nov 6, 2024 · Updated last year
princeton-nlp / LLMBar
[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following
☆138 · Jul 8, 2024 · Updated 2 years ago
OFA-Sys / gsm8k-ScRel
Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
☆268 · Sep 12, 2024 · Updated last year
conceptmath / conceptmath
[ACL 2024 Findings] The official repo for "ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large …
☆26 · May 29, 2024 · Updated 2 years ago
GAIR-NLP / ReAlign
Reformatted Alignment
☆111 · Sep 23, 2024 · Updated last year
princeton-nlp / QuRating
[ICML 2024] Selecting High-Quality Data for Training Language Models
☆204 · Dec 8, 2025 · Updated 7 months ago
ChengpengLi1003 / DotaMath
☆30 · Dec 27, 2024 · Updated last year
YJiangcm / SST-2-sentiment-analysis
Use BiLSTM_attention, BERT, ALBERT, RoBERTa, XLNet model to classify the SST-2 data set based on pytorch
☆111 · Dec 9, 2020 · Updated 5 years ago
RUCAIBox / JiuZhang3.0
The code and data for the paper JiuZhang3.0
☆49 · May 26, 2024 · Updated 2 years ago
yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆32 · Oct 9, 2025 · Updated 9 months ago
KbsdJames / Omni-MATH
The official repository of the Omni-MATH benchmark.
☆94 · Dec 22, 2024 · Updated last year
TIGER-AI-Lab / MAmmoTH2
Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]
☆146 · Oct 27, 2024 · Updated last year
mutonix / RefGPT
☆98 · Mar 20, 2024 · Updated 2 years ago
weiyifan1023 / MenatQA
Code and Data for EMNLP 2023 Paper "MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Langu…
☆14 · Apr 7, 2025 · Updated last year
facebookresearch / Multi-IF
The evaluation code for MultiIF multi-turn and multi-lingual instruction following
☆63 · Oct 29, 2024 · Updated last year
lmarena / arena-hard-auto
Arena-Hard-Auto: An automatic LLM benchmark.
☆1,050 · Jun 21, 2025 · Updated last year