forecastingresearch / forecastbench-datasets
ForecastBench datasets, updated nightly
☆13 · Updated last week
Alternatives and similar repositories for forecastbench-datasets
Users who are interested in forecastbench-datasets are comparing it to the repositories listed below.
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ☆18 · Updated last year
- Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆54 · Updated last month
- Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation ☆44 · Updated last year
- ☆28 · Updated 5 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆61 · Updated 8 months ago
- ☆56 · Updated 2 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆88 · Updated 11 months ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆69 · Updated last year
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you… ☆76 · Updated 8 months ago
- accompanying material for sleep-time compute paper ☆107 · Updated 4 months ago
- A framework for pitting LLMs against each other in an evolving library of games ☆33 · Updated 4 months ago
- ☆54 · Updated 9 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆37 · Updated 8 months ago
- Official repo for Learning to Reason for Long-Form Story Generation ☆68 · Updated 4 months ago
- ☆32 · Updated last year
- ☆90 · Updated 7 months ago
- A virtual environment for developing and evaluating automated scientific discovery agents. ☆181 · Updated 5 months ago
- ☆13 · Updated 4 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆92 · Updated 7 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆150 · Updated 7 months ago
- ☆43 · Updated 10 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆122 · Updated 10 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆84 · Updated 5 months ago
- Interaction-first method for generating demonstrations for web-agents on any website ☆47 · Updated 4 months ago
- ☆11 · Updated 9 months ago
- The Library for LLM-based multi-agent applications ☆90 · Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 · Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆36 · Updated last year
- ☆67 · Updated 5 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆120 · Updated 6 months ago