ChiyuSONG / dynamics-of-instruction-tuning
☆17 · Updated 5 months ago
Alternatives and similar repositories for dynamics-of-instruction-tuning
Users who are interested in dynamics-of-instruction-tuning are comparing it to the repositories listed below.
- Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning ☆36 · Updated 2 years ago
- 🩺 A collection of ChatGPT evaluation reports on various benchmarks. ☆50 · Updated 2 years ago
- Code for our EMNLP-2023 paper: "Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks" ☆24 · Updated last year
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆62 · Updated last year
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58 · Updated last year
- Implementation of ICML 23 Paper: Specializing Smaller Language Models towards Multi-Step Reasoning. ☆132 · Updated 2 years ago
- Do Large Language Models Know What They Don't Know? ☆99 · Updated 9 months ago
- Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations" ☆68 · Updated last year
- The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning". ☆65 · Updated 2 years ago
- ☆53 · Updated last year
- ☆56 · Updated last year
- ☆34 · Updated last year
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆41 · Updated 2 years ago
- ☆41 · Updated last year
- Towards Systematic Measurement for Long Text Quality ☆37 · Updated 11 months ago
- Repo for outstanding paper@ACL 2023 "Do PLMs Know and Understand Ontological Knowledge?" ☆32 · Updated last year
- ☆31 · Updated 2 years ago
- Source codes and datasets for How well do Large Language Models perform in Arithmetic tasks? ☆57 · Updated 2 years ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆51 · Updated 2 months ago
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following ☆129 · Updated last year
- Lightweight tool to identify Data Contamination in LLMs evaluation ☆51 · Updated last year
- [ICLR24] The open-source repo of THU-KEG's KoLA benchmark. ☆51 · Updated last year
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆60 · Updated 2 years ago
- Supporting code for ReCEval paper ☆29 · Updated 11 months ago
- Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts ☆40 · Updated 10 months ago
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. ☆12 · Updated 2 years ago
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track) ☆91 · Updated 6 months ago
- First explanation metric (diagnostic report) for text generation evaluation ☆62 · Updated 5 months ago
- ☆30 · Updated 7 months ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024] ☆74 · Updated 9 months ago