dannyallover / llm_forecasting
Forecasting with LLMs
☆55 Updated last year
Alternatives and similar repositories for llm_forecasting
Users who are interested in llm_forecasting are comparing it to the libraries listed below.
- ☆54 Updated last year
- Causal DAG Extraction from Text (DEFT) ☆66 Updated 11 months ago
- ☆44 Updated last year
- ☆79 Updated last year
- ☆104 Updated 4 months ago
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs. ☆134 Updated last year
- [ICML 2025] HypotheSAEs: Hypothesizing interpretable relationships in text datasets using sparse autoencoders. https://arxiv.org/abs/2502… ☆66 Updated last month
- ☆124 Updated 2 months ago
- ☆207 Updated last week
- A lightweight library for Bayesian analysis of LLM evals (ICML 2025 Spotlight Position Paper) ☆21 Updated 6 months ago
- ☆13 Updated 10 months ago
- Inference-time scaling for LLMs-as-a-judge. ☆317 Updated last month
- ☆62 Updated last year
- Forecastbench Datasets, updated nightly ☆20 Updated this week
- Examining how large language models (LLMs) perform across various synthetic regression tasks when given (input, output) examples in their… ☆157 Updated 2 months ago
- Data and code for the Corr2Cause paper (ICLR 2024) ☆111 Updated last year
- Functional Benchmarks and the Reasoning Gap ☆90 Updated last year
- An attribution library for LLMs ☆46 Updated last year
- Evaluation of neuro-symbolic engines ☆40 Updated last year
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆94 Updated 2 months ago
- A dynamic forecasting benchmark for LLMs ☆48 Updated last week
- ⚖️ Awesome LLM Judges ⚖️ ☆146 Updated 7 months ago
- Based on the tree of thoughts paper ☆48 Updated 2 years ago
- Governance of the Commons Simulation (GovSim) ☆62 Updated 11 months ago
- Repository for paper Decrypting Cryptic Crosswords ☆10 Updated 3 years ago
- An agent orchestration framework for economic agents ☆109 Updated 4 months ago
- A toolkit for describing model features and intervening on those features to steer behavior. ☆223 Updated last week
- Extending Conformal Prediction to LLMs ☆68 Updated last year
- Open source interpretability artefacts for R1. ☆165 Updated 8 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆122 Updated last year