allenai / everyday-things
☆17Updated last year
Alternatives and similar repositories for everyday-things:
Users who are interested in everyday-things are comparing it to the repositories listed below
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆54Updated 4 months ago
- A unified benchmark for math reasoning☆87Updated 2 years ago
- ☆23Updated 7 months ago
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI☆94Updated 2 years ago
- SILO Language Models code repository☆81Updated last year
- Few-shot Learning with Auxiliary Data☆27Updated last year
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod…☆14Updated last year
- Apps built using Inspired Cognition's Critique.☆58Updated 2 years ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions"☆54Updated 10 months ago
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners☆114Updated 7 months ago
- Byte-sized text games for code generation tasks on virtual environments☆19Updated 9 months ago
- OpenPI dataset for tracking entities in open domain procedural text☆22Updated 8 months ago
- InstructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc…☆31Updated 10 months ago
- Evaluation on Logical Reasoning and Abstract Reasoning Challenges☆26Updated this week
- Supporting code for ReCEval paper☆28Updated 7 months ago
- Neural models of common sense. 🤖☆96Updated last year
- Repo for: When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment☆38Updated last year
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"☆59Updated 2 months ago
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"☆56Updated last year
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch☆30Updated last week
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions☆69Updated 2 years ago
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks☆41Updated 4 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024)☆36Updated 3 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆48Updated last year
- ☆90Updated 2 months ago
- Code and Data for the NAACL 24 paper: MacGyver: Are Large Language Models Creative Problem Solvers?☆27Updated last year
- Code for our EMNLP '22 paper "Fixing Model Bugs with Natural Language Patches"☆19Updated 2 years ago
- Repository for Skill Set Optimization☆12Updated 9 months ago
- ReBase: Training Task Experts through Retrieval Based Distillation☆29Updated 2 months ago
- ☆36Updated 2 years ago