EternityYW / TRAM-Benchmark
TRAM: Benchmarking Temporal Reasoning for Large Language Models (Findings of ACL 2024)
☆23 Updated 7 months ago
Alternatives and similar repositories for TRAM-Benchmark:
Users who are interested in TRAM-Benchmark are comparing it to the repositories listed below.
- The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models" ☆29 Updated 7 months ago
- Code and Data for NeurIPS2021 Paper "A Dataset for Answering Time-Sensitive Questions" ☆66 Updated 2 years ago
- Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified Approach. EMNLP 2022. ☆21 Updated 2 years ago
- Methods and evaluation for aligning language models temporally ☆27 Updated 11 months ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆65 Updated 10 months ago
- ☆36 Updated 10 months ago
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆39 Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆52 Updated 4 months ago
- ☆25 Updated last year
- AbstainQA, ACL 2024 ☆25 Updated 4 months ago
- [ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”. ☆97 Updated last year
- ☆37 Updated last year
- ☆40 Updated last year
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆106 Updated 5 months ago
- [ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners" ☆108 Updated last year
- Data and code for ACL 2022 paper "MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data" ☆42 Updated 3 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆34 Updated last year
- Repo for outstanding paper@ACL 2023 "Do PLMs Know and Understand Ontological Knowledge?" ☆30 Updated last year
- Source code for InBedder, an instruction-following text embedder ☆24 Updated 4 months ago
- Code for RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs. ACL 2023. ☆63 Updated 2 months ago
- ☆31 Updated last year
- ☆14 Updated last year
- Code and data for the FACTOR paper ☆44 Updated last year
- Code for reproducing the ACL'23 paper: Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments ☆73 Updated last year
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆57 Updated last year
- Supporting code for ReCEval paper ☆28 Updated 5 months ago
- ☆28 Updated last year
- ☆85 Updated last year
- ☆41 Updated 3 months ago
- [EMNLP 2023] Once Upon a *Time* in *Graph*: Relative-Time Pretraining for Complex Temporal Reasoning ☆17 Updated last year