forecastingresearch / forecastbench-datasets
Forecastbench Datasets, updated nightly
☆18 · Updated this week
Alternatives and similar repositories for forecastbench-datasets
Users who are interested in forecastbench-datasets are comparing it to the repositories listed below.
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ☆23 · Updated last year
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆92 · Updated last month
- ☆43 · Updated last year
- Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation ☆48 · Updated last year
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆54 · Updated 4 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆88 · Updated this week
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆69 · Updated last year
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆63 · Updated 11 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 · Updated 9 months ago
- ☆15 · Updated 6 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆37 · Updated last year
- ☆11 · Updated last year
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆34 · Updated 6 months ago
- An attribution library for LLMs ☆45 · Updated last year
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆76 · Updated 11 months ago
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you… ☆79 · Updated 10 months ago
- Official repo for Learning to Reason for Long-Form Story Generation ☆72 · Updated 6 months ago
- ☆40 · Updated 10 months ago
- The Library for LLM-based multi-agent applications ☆91 · Updated 3 months ago
- Accompanying material for sleep-time compute paper ☆117 · Updated 6 months ago
- ☆103 · Updated 10 months ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆42 · Updated last year
- Plug in and Play implementation of "Certified Reasoning with Language Models" that elevates model reasoning by 40% ☆15 · Updated 2 years ago
- ☆86 · Updated last year
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery ☆106 · Updated 2 months ago
- EcoAssistant: using LLM assistant more affordably and accurately ☆133 · Updated last year
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆48 · Updated last year
- Based on the tree of thoughts paper ☆48 · Updated 2 years ago
- Interaction-first method for generating demonstrations for web-agents on any website ☆50 · Updated 6 months ago
- LMQL implementation of tree of thoughts ☆34 · Updated last year