princeton-nlp / LM-Science-Tutor
☆39, updated 6 months ago
Alternatives and similar repositories for LM-Science-Tutor:
Users who are interested in LM-Science-Tutor are comparing it to the repositories listed below.
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆39, updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54, updated 11 months ago
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval ☆40, updated 3 months ago
- [arXiv preprint] Official Repository for "Evaluating Language Models as Synthetic Data Generators" ☆34, updated 2 months ago
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆58, updated 2 years ago
- Evaluate the Quality of Critique ☆35, updated 8 months ago
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain cause-effect relationships. It is a QA dataset containing 9000… ☆47, updated last year
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆65, updated 10 months ago
- Supporting code for ReCEval paper ☆28, updated 5 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆69, updated 2 months ago
- Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge" ☆74, updated last year
- [ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners" ☆108, updated last year
- InstructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc… ☆31, updated 8 months ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆42, updated 7 months ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆40, updated last year
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond" ☆59, updated 3 weeks ago
- The official code and dataset for EMNLP 2022 paper "COPEN: Probing Conceptual Knowledge in Pre-trained Language Models". ☆19, updated last year
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ☆75, updated last week
- A Human-LLM Collaborative Dataset for Generative Information-seeking with Attribution ☆30, updated last year
- The instructions and demonstrations for building a GLM capable of formal logical reasoning ☆53, updated 5 months ago
- AbstainQA, ACL 2024 ☆25, updated 4 months ago
- "FiD-ICL: A Fusion-in-Decoder Approach for Efficient In-Context Learning" (ACL 2023) ☆13, updated last year
- [EMNLP'24 (Main)] DRPO (Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-… ☆20, updated 3 months ago
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training ☆19, updated 6 months ago