dannyallover / llm_forecasting
Forecasting with LLMs
☆55 Updated last year
Alternatives and similar repositories for llm_forecasting
Users who are interested in llm_forecasting are comparing it to the repositories listed below
- ☆56 Updated last year
- ☆80 Updated last year
- A lightweight library for Bayesian analysis of LLM evals (ICML 2025 Spotlight Position Paper) ☆21 Updated 7 months ago
- Public repository containing METR's DVC pipeline for eval data analysis ☆183 Updated last week
- HypotheSAEs: hypothesizing interpretable relationships in text datasets using sparse autoencoders. https://arxiv.org/abs/2502.04382 ☆71 Updated 2 months ago
- ☆214 Updated 3 weeks ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆97 Updated 3 months ago
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs. ☆136 Updated last year
- Forecastbench Datasets, updated nightly ☆22 Updated this week
- ☆105 Updated 5 months ago
- ☆130 Updated 3 months ago
- ☆194 Updated 6 months ago
- ☆43 Updated last year
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆124 Updated last year
- An attribution library for LLMs ☆46 Updated last year
- ☆62 Updated last month
- The Prism Alignment Project ☆87 Updated last year
- Inference-time scaling for LLMs-as-a-judge. ☆325 Updated 2 months ago
- This is the official repository for HypoGeniC (Hypothesis Generation in Context) and HypoRefine, which are automated, data-driven tools t… ☆101 Updated 2 months ago
- ☆50 Updated last year
- ☆14 Updated 11 months ago
- Causal DAG Extraction from Text (DEFT) ☆66 Updated last year
- Governance of the Commons Simulation (GovSim) ☆64 Updated last year
- Examining how large language models (LLMs) perform across various synthetic regression tasks when given (input, output) examples in their… ☆160 Updated 3 months ago
- Functional Benchmarks and the Reasoning Gap ☆89 Updated last year
- Extending Conformal Prediction to LLMs ☆69 Updated last year
- Sphynx Hallucination Induction ☆52 Updated 11 months ago
- Learning to route instances for Human vs AI Feedback (ACL Main '25) ☆26 Updated 5 months ago
- Open source interpretability artefacts for R1. ☆167 Updated 9 months ago
- ☆64 Updated last week