WooooDyy / Self-Polish
Code for the EMNLP 2023 Findings paper "Self-Polish: Enhance Reasoning in Large Language Models via Problem Refining" by Zhiheng Xi, Senjie Jin, Yuhao Zhou, Rui Zheng, Songyang Gao, Tao Gui, Qi Zhang, and Xuanjing Huang.
☆30 Updated 2 years ago
Alternatives and similar repositories for Self-Polish
Users who are interested in Self-Polish are comparing it to the repositories listed below.
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location. ☆81 Updated last year
- ToolBench, an evaluation suite for LLM tool manipulation capabilities. ☆160 Updated last year
- augmented LLM with self reflection ☆132 Updated last year
- Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets" ☆58 Updated last year
- [ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning ☆229 Updated 8 months ago
- The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models" ☆32 Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Updated last year
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆97 Updated last year
- ☆122 Updated last year
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆148 Updated 10 months ago
- [COLM'24] Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration ☆29 Updated 11 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆37 Updated 8 months ago
- Scripts for generating synthetic finetuning data for reducing sycophancy. ☆116 Updated 2 years ago
- ☆68 Updated 2 years ago
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆69 Updated 4 months ago
- RL Scaling and Test-Time Scaling (ICML'25) ☆111 Updated 7 months ago
- Syntax Error-Free and Generalizable Tool Use for LLMs via Finite-State Decoding ☆27 Updated last year
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆87 Updated last year
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M… ☆29 Updated last year
- ☆74 Updated last year
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆51 Updated 3 months ago
- The code for the paper: "Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models" ☆54 Updated last year
- FuseAI Project ☆87 Updated 7 months ago
- Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought" ☆96 Updated last year
- [ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant ☆40 Updated 9 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor ☆30 Updated last year
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models" ☆209 Updated last year
- Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models ☆88 Updated last year
- The official repository of "ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models" ☆44 Updated 2 years ago