allenai / discoverybench
Discovering Data-driven Hypotheses in the Wild
☆54 Updated 2 months ago
Alternatives and similar repositories for discoverybench:
Users who are interested in discoverybench are comparing it to the repositories listed below.
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆69 Updated last month
- ☆116 Updated 3 months ago
- ☆118 Updated last week
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆64 Updated 7 months ago
- The Prism Alignment Project ☆63 Updated 9 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆125 Updated 5 months ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆69 Updated last year
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆64 Updated 5 months ago
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding. ☆31 Updated last month
- A benchmark that challenges language models to code solutions for scientific problems ☆97 Updated this week
- ☆67 Updated 5 months ago
- ☆139 Updated this week
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆156 Updated 3 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆51 Updated 10 months ago
- ☆27 Updated 4 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆53 Updated 5 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆155 Updated this week
- Code/data for MARG (multi-agent review generation) ☆38 Updated 2 months ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆68 Updated last year
- Dataset collection and preprocessing framework for NLP extreme multitask learning ☆173 Updated 3 weeks ago
- PyTorch library for Active Fine-Tuning ☆53 Updated 3 weeks ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆31 Updated 3 months ago
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks ☆40 Updated 2 months ago
- ☆38 Updated 9 months ago
- ☆40 Updated 3 months ago
- Code for Zero-Shot Tokenizer Transfer ☆121 Updated 2 weeks ago
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award … ☆37 Updated 3 months ago
- ☆52 Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆124 Updated 10 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆67 Updated 3 months ago