fannie1208 / W4S
[COLM2025] "Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors"
☆27Updated this week
Alternatives and similar repositories for W4S
Users who are interested in W4S are comparing it to the libraries listed below
- ☆64Updated 2 weeks ago
- [ACL'24] Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correla…☆45Updated 4 months ago
- WideSearch: Benchmarking Agentic Broad Info-Seeking☆94Updated last month
- ☆46Updated 3 months ago
- On Memorization of Large Language Models in Logical Reasoning☆72Updated 6 months ago
- Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"☆111Updated last month
- A curated list of Awesome-LLM-Ensemble papers for the survey "Harnessing Multiple Large Language Models: A Survey on LLM Ensemble"☆137Updated this week
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆129Updated 7 months ago
- SSRL: Self-Search Reinforcement Learning☆145Updated last month
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆114Updated 5 months ago
- Official implementation of MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems☆63Updated 3 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆82Updated 6 months ago
- ☆60Updated 10 months ago
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆56Updated 3 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆137Updated 3 months ago
- ☆104Updated 10 months ago
- This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIXTURE OF THOUGHT REPRESENTATIONS FOR COST-EFFICIENT REA…☆26Updated last year
- Data and Code for EMNLP 2025 Findings Paper "MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search"☆71Updated 3 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆96Updated 9 months ago
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent☆65Updated 4 months ago
- Can Knowledge Editing Really Correct Hallucinations? (ICLR 2025)☆25Updated 2 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples☆106Updated 2 months ago
- The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents"☆74Updated last week
- MPO: Boosting LLM Agents with Meta Plan Optimization (EMNLP 2025 Findings)☆71Updated last month
- [NeurIPS 2025 Spotlight] ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning☆122Updated 3 weeks ago
- Exploring whether LLMs perform case-based or rule-based reasoning☆29Updated last year
- This is the implementation of LeCo☆31Updated 8 months ago
- RL Scaling and Test-Time Scaling (ICML'25)☆111Updated 8 months ago
- [EMNLP 2025] WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning☆44Updated last month
- ☆127Updated last month