multimodal-art-projection / I-SHEEP
I-SHEEP: Iterative Self-enHancEmEnt Paradigm of LLMs through Self-Instruct and Self-Assessment
☆16Updated 4 months ago
Alternatives and similar repositories for I-SHEEP
Users that are interested in I-SHEEP are comparing it to the libraries listed below
Sorting:
- Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…☆48Updated 10 months ago
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"☆74Updated 11 months ago
- ☆49Updated last year
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".☆77Updated 4 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆151Updated 8 months ago
- Reformatted Alignment☆114Updated 7 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]☆140Updated 6 months ago
- Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?"(AAAI2024)☆48Updated last year
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis☆43Updated 3 weeks ago
- [ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"☆75Updated 5 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning☆69Updated 5 months ago
- Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets"☆56Updated 11 months ago
- ☆69Updated last year
- ☆59Updated 8 months ago
- Unofficial implementation of AlpaGasus☆91Updated last year
- ☆151Updated 5 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling☆101Updated 3 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments"☆58Updated last year
- We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.☆62Updated 6 months ago
- Official repository for paper "TableBench: A Comprehensive and Complex Benchmark for Table Question Answering"☆54Updated last week
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆114Updated 3 weeks ago
- The code for the paper: "Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models"☆54Updated 10 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA☆126Updated 6 months ago
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.☆139Updated last week
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate"☆146Updated 3 weeks ago
- [ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models☆98Updated 3 weeks ago
- Code implementation of synthetic continued pretraining☆109Updated 4 months ago
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios☆66Updated this week
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"☆69Updated last year
- Official github repo for AutoDetect, an automated weakness detection framework for LLMs.☆42Updated 10 months ago