Ag2S1 / Sibyl-System
☆114Updated 6 months ago
Alternatives and similar repositories for Sibyl-System:
Users who are interested in Sibyl-System are comparing it to the libraries listed below:
- An implementation of Everything of Thoughts (XoT).☆139Updated 11 months ago
- [ACL 2024] AUTOACT: Automatic Agent Learning from Scratch for QA via Self-Planning☆206Updated last month
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"☆100Updated 5 months ago
- ☆120Updated 8 months ago
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap…☆145Updated 2 months ago
- Official implementation of paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038)☆172Updated 4 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆81Updated 4 months ago
- ☆98Updated 2 months ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…☆203Updated 3 months ago
- 🦀️ CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/☆213Updated 2 months ago
- Codebase accompanying the Summary of a Haystack paper.☆74Updated 5 months ago
- Code for the paper 🌳 Tree Search for Language Model Agents☆178Updated 6 months ago
- My implementation of "Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models"☆97Updated last year
- AWM: Agent Workflow Memory☆241Updated 2 weeks ago
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024)☆128Updated 3 months ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities☆141Updated last week
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and …☆336Updated 8 months ago
- [NeurIPS 2024] Agent Planning with World Knowledge Model☆110Updated 2 months ago
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral]☆281Updated 9 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users☆215Updated 3 months ago
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀☆71Updated 3 months ago
- Augmented LLM with self-reflection☆111Updated last year
- ☆108Updated 3 weeks ago
- A pipeline for LLM knowledge distillation☆89Updated 3 weeks ago
- ☆108Updated 5 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆125Updated 2 months ago
- A simple unified framework for evaluating LLMs☆197Updated 2 weeks ago
- Beating the GAIA benchmark with Transformers Agents. 🚀☆87Updated this week
- ToolBench, an evaluation suite for LLM tool manipulation capabilities.☆150Updated 11 months ago