forecastingresearch / forecastbench-datasets
Forecastbench Datasets, updated nightly
☆11 Updated this week
Alternatives and similar repositories for forecastbench-datasets
Users who are interested in forecastbench-datasets are comparing it to the repositories listed below.
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 9 months ago
- Code and Data for "Language Modeling with Editable External Knowledge" ☆33 Updated 11 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆90 Updated 4 months ago
- This repository contains popular code generation frameworks such as MapCoder, CodeSIM. ☆51 Updated last month
- Minimal implementation of the Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models paper (arXiv 2401.01335) ☆29 Updated last year
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆89 Updated last week
- Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation ☆42 Updated last year
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ☆16 Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆34 Updated last year
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆36 Updated 5 months ago
- [ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability ☆11 Updated 2 months ago
- ☆18 Updated last month
- ☆65 Updated 2 months ago
- LMQL implementation of tree of thoughts ☆34 Updated last year
- Code/data for MARG (multi-agent review generation) ☆43 Updated 6 months ago
- A dynamic forecasting benchmark for LLMs ☆21 Updated this week
- Code and Data for "MIRAI: Evaluating LLM Agents for Event Forecasting" ☆65 Updated 11 months ago
- ☆10 Updated last month
- ☆20 Updated last month
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆26 Updated 5 months ago
- This is a pip package implementing Reinforcement Learning algorithms in non-stationary environments supported by the OpenAI Gym toolkit. ☆15 Updated 11 months ago
- The official repository of the OpenToM dataset ☆23 Updated 4 months ago
- Metacognitive Prompting Improves Understanding in Large Language Models (NAACL 2024) ☆34 Updated last year
- ☆21 Updated 3 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆54 Updated last year
- Download, parse, and filter data from Phil Papers. Data-ready for The-Pile. ☆16 Updated last year
- Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks" ☆20 Updated last month
- Automated Design of Agentic Systems ☆10 Updated 9 months ago
- ☆59 Updated this week
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆70 Updated 11 months ago