Spico197 / Humback

🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.

☆139

Alternatives and similar repositories for Humback:

Users who are interested in Humback are comparing it to the libraries listed below.

tianyi-lab / Superfiltering
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
☆149 Updated 7 months ago
icip-cas / awesome-auto-alignment
Collection of papers for scalable automated alignment.
☆89 Updated 6 months ago
thu-coai / ComplexBench
Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
☆81 Updated 2 months ago
Junjie-Ye / ToolEyes
[COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios
☆65 Updated 5 months ago
thu-coai / CritiqueLLM
☆143 Updated 10 months ago
gpt4life / alpagasus
Unofficial implementation of AlpaGasus
☆91 Updated last year
mtbench101 / mt-bench-101
[ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
☆85 Updated 9 months ago
CASIA-LM / MoDS
☆140 Updated last year
pldlgb / nuggets
☆81 Updated last year
Abbey4799 / CELLO
Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?" (AAAI 2024)
☆48 Updated last year
Felixgithub2017 / MMCU
MEASURING MASSIVE MULTITASK CHINESE UNDERSTANDING
☆87 Updated last year
ZitongYang / Synthetic_Continued_Pretraining
Code implementation of synthetic continued pretraining
☆107 Updated 4 months ago
GAIR-NLP / auto-j
Generative Judge for Evaluating Alignment
☆236 Updated last year
OFA-Sys / InsTag
InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning
☆256 Updated last year
YJiangcm / FollowBench
[ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models
☆98 Updated 2 weeks ago
OpenMOSS / HalluQA
Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"
☆125 Updated 11 months ago
nick7nlp / Counting-Stars
Counting-Stars (★)
☆82 Updated 8 months ago
l294265421 / alpaca-rlhf
Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat
☆115 Updated last year
MikeGu721 / XiezhiBenchmark
☆97 Updated last year
princeton-nlp / QuRating
[ICML 2024] Selecting High-Quality Data for Training Language Models
☆169 Updated 10 months ago
yinzhangyue / SelfAware
Do Large Language Models Know What They Don’t Know?
☆94 Updated 5 months ago
FranxYao / FlanT5-CoT-Specialization
Implementation of ICML 23 Paper: Specializing Smaller Language Models towards Multi-Step Reasoning.
☆130 Updated last year
IronBeliever / CaR
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
☆78 Updated 5 months ago
MARIO-Math-Reasoning / MARIO_EVAL
☆45 Updated 2 months ago
shizhediao / R-Tuning
[NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't…
☆110 Updated 9 months ago
zexuanqiu / CLongEval
CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models
☆40 Updated last year
princeton-nlp / CEPE
[ACL 2024] Long-Context Language Modeling with Parallel Encodings
☆154 Updated 10 months ago
pillowsofwind / Knowledge-Conflicts-Survey
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
☆114 Updated 7 months ago
OFA-Sys / gsm8k-ScRel
Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
☆260 Updated 7 months ago
fanqiwan / Explore-Instruct
EMNLP'2023: Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration
☆35 Updated last year