allenai / everyday-things
☆17 · Updated last year
Alternatives and similar repositories for everyday-things:
Users who are interested in everyday-things are comparing it to the repositories listed below
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI ☆94 · Updated 2 years ago
- ☆23 · Updated 6 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆49 · Updated 2 months ago
- Supporting code for ReCEval paper ☆28 · Updated 5 months ago
- SILO Language Models code repository ☆81 · Updated last year
- A unified benchmark for math reasoning ☆87 · Updated 2 years ago
- Byte-sized text games for code generation tasks on virtual environments ☆19 · Updated 7 months ago
- Evaluation on Logical Reasoning and Abstract Reasoning Challenges ☆23 · Updated last year
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond" ☆59 · Updated last month
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions" ☆52 · Updated 9 months ago
- Source code and data for The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code (Findings of ACL 2023… ☆29 · Updated last year
- Repo for: When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment ☆38 · Updated last year
- Code and Data for the NAACL 24 paper: MacGyver: Are Large Language Models Creative Problem Solvers? ☆25 · Updated 11 months ago
- ☆45 · Updated last year
- Apps built using Inspired Cognition's Critique. ☆58 · Updated last year
- OpenPI dataset for tracking entities in open domain procedural text ☆22 · Updated 6 months ago
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆113 · Updated 5 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆46 · Updated last year
- ☆35 · Updated 2 years ago
- Documentation effort for the BookCorpus dataset ☆33 · Updated 3 years ago
- ☆44 · Updated 3 months ago
- Neural models of common sense. 🤖 ☆95 · Updated last year
- [EACL 2023] CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification ☆38 · Updated last year
- ☆26 · Updated last month
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆14 · Updated last year
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs" ☆28 · Updated 2 years ago
- ☆26 · Updated 2 years ago
- Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models" ☆41 · Updated last year
- InstructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc… ☆31 · Updated 8 months ago
- Code for our EMNLP '22 paper "Fixing Model Bugs with Natural Language Patches" ☆19 · Updated 2 years ago