dannyallover / llm_forecasting
Forecasting with LLMs
☆47 Updated last year
Alternatives and similar repositories for llm_forecasting
Users who are interested in llm_forecasting are comparing it to the libraries listed below.
- ☆43 Updated 7 months ago
- Data and code for the Corr2Cause paper (ICLR 2024) ☆105 Updated last year
- Causal DAG Extraction from Text (DEFT) ☆66 Updated 5 months ago
- ☆48 Updated 3 weeks ago
- ☆87 Updated 2 months ago
- ☆32 Updated 3 weeks ago
- Governance of the Commons Simulation (GovSim) ☆51 Updated 5 months ago
- Learning to route instances for Human vs AI Feedback (ACL 2025 Main) ☆23 Updated last month
- ☆27 Updated last year
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆57 Updated 6 months ago
- ☆92 Updated 3 weeks ago
- Evaluation of neuro-symbolic engines ☆35 Updated 10 months ago
- Datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆81 Updated last year
- Evaluating the Moral Beliefs Encoded in LLMs ☆26 Updated 6 months ago
- Run SWE-bench evaluations remotely ☆21 Updated last month
- How to create rational LLM-based agents? Using game-theoretic workflows! ☆72 Updated 2 weeks ago
- Evaluate uncertainty, calibration, accuracy, and fairness of LLMs on real-world survey data! ☆22 Updated 2 months ago
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs. ☆119 Updated last year
- Probabilistic programming with large language models ☆121 Updated 2 weeks ago
- A distributed agent orchestration framework for market agents ☆102 Updated this week
- Code for 'Emergent Analogical Reasoning in Large Language Models' ☆50 Updated last year
- ☆69 Updated last year
- Code for ☆27 Updated 6 months ago
- Code for "Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs" ☆51 Updated 4 months ago
- Public repository containing METR's DVC pipeline for eval data analysis ☆70 Updated 2 months ago
- Repository of paper "How Likely Do LLMs with CoT Mimic Human Reasoning?" ☆22 Updated 4 months ago
- Functional Benchmarks and the Reasoning Gap ☆87 Updated 8 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆82 Updated 8 months ago
- 🤝 The code for "Can Large Language Model Agents Simulate Human Trust Behaviors?" ☆84 Updated 2 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆95 Updated 2 weeks ago