Ag2S1 / Sibyl-System
☆119Updated 8 months ago
Alternatives and similar repositories for Sibyl-System:
Users who are interested in Sibyl-System are comparing it to the repositories listed below.
- ☆121Updated 11 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"☆107Updated 7 months ago
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"☆186Updated 3 weeks ago
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀☆95Updated 6 months ago
- AWM: Agent Workflow Memory☆268Updated 3 months ago
- [ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning☆221Updated 3 months ago
- An implementation of Everything of Thoughts (XoT).☆142Updated last year
- ☆114Updated 2 months ago
- My implementation of "Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models"☆98Updated last year
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate"☆141Updated 2 weeks ago
- Implementation of the Quiet-STAR paper (https://arxiv.org/pdf/2403.09629.pdf)☆53Updated 8 months ago
- Beating the GAIA benchmark with Transformers Agents. 🚀☆113Updated 2 months ago
- Code for the paper 🌳 Tree Search for Language Model Agents☆197Updated 9 months ago
- FireAct: Toward Language Agent Fine-tuning☆275Updated last year
- Benchmarking LLMs with Challenging Tasks from Real Users☆221Updated 6 months ago
- FuseAI Project☆85Updated 3 months ago
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral]☆309Updated 11 months ago
- Official implementation of paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038)☆178Updated last month
- [ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents☆201Updated 2 weeks ago
- ☆180Updated 3 months ago
- [NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?☆123Updated 8 months ago
- ☆155Updated 8 months ago
- Augmented LLM with self-reflection☆120Updated last year
- Toy implementation of Strawberry☆31Updated 7 months ago
- Code for the paper "Autonomous Evaluation and Refinement of Digital Agents" [COLM 2024]☆135Updated 5 months ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…☆215Updated 6 months ago
- ☆102Updated 5 months ago
- ☆40Updated 9 months ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals".☆66Updated 10 months ago
- ☆120Updated 7 months ago