Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"
☆78 · Apr 12, 2023 · Updated 2 years ago
Alternatives and similar repositories for long_tail_knowledge
Users that are interested in long_tail_knowledge are comparing it to the libraries listed below.
- Code for "Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model", EMNLP Findings 20… ☆28 · Nov 2, 2023 · Updated 2 years ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆54 · Nov 21, 2022 · Updated 3 years ago
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu). ☆20 · May 14, 2022 · Updated 3 years ago
- ☆11 · Jun 5, 2024 · Updated last year
- Code associated with the paper: "Few-Shot Self-Rationalization with Natural Language Prompts" ☆13 · Apr 27, 2022 · Updated 3 years ago
- Continual Memorization of Factoids in Large Language Models ☆12 · Nov 20, 2024 · Updated last year
- ☆11 · Jan 2, 2022 · Updated 4 years ago
- Language models scale reliably with over-training and on downstream tasks ☆101 · Apr 2, 2024 · Updated last year
- ☆11 · Jul 15, 2020 · Updated 5 years ago
- ☆187 · Jul 2, 2025 · Updated 8 months ago
- AAAI 2022 Paper: Bet even Beth Harmon couldn't learn chess like that :) ☆38 · Mar 3, 2021 · Updated 5 years ago
- Code for paper 'Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse' ☆13 · Aug 2, 2024 · Updated last year
- ☆25 · Dec 12, 2025 · Updated 3 months ago
- [ICLR 2021] "InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective" by Boxin Wang, Shuohang Wang, Y… ☆85 · Oct 25, 2023 · Updated 2 years ago
- Can VLMs understand students' hand-drawn math work? ☆17 · Jan 20, 2026 · Updated 2 months ago
- ☆17 · Aug 2, 2023 · Updated 2 years ago
- ☆12 · Jul 6, 2023 · Updated 2 years ago
- ☆15 · Jan 9, 2026 · Updated 2 months ago
- Code for the CRAC 2021 paper "On Generalization in Coreference Resolution" (Best short paper award) ☆36 · Jul 28, 2023 · Updated 2 years ago
- Code for SLT 2016 paper on Grapheme-to-Phoneme conversion using attention based encoder-decoder models ☆15 · Feb 20, 2019 · Updated 7 years ago
- ☆56 · Apr 11, 2024 · Updated last year
- Pile Deduplication Code ☆18 · May 15, 2023 · Updated 2 years ago
- Official codebase for permutation self-consistency. ☆18 · Feb 11, 2024 · Updated 2 years ago
- ☆28 · Feb 17, 2024 · Updated 2 years ago
- NAACL 2022: Can Rationalization Improve Robustness? https://arxiv.org/abs/2204.11790 ☆27 · Nov 21, 2022 · Updated 3 years ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆45 · Oct 1, 2025 · Updated 5 months ago
- ☆13 · Jul 2, 2025 · Updated 8 months ago
- Test-time-training on nearest neighbors for large language models ☆49 · Apr 18, 2024 · Updated last year
- Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 20… ☆110 · Apr 18, 2022 · Updated 3 years ago
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji… ☆241 · Nov 3, 2023 · Updated 2 years ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆201 · Dec 8, 2025 · Updated 3 months ago
- ☆282 · Mar 2, 2024 · Updated 2 years ago
- ☆17 · Dec 6, 2023 · Updated 2 years ago
- Source code of "Calibrating Large Language Models Using Their Generations Only", ACL2024 ☆22 · Nov 20, 2024 · Updated last year
- DSIR large-scale data selection framework for language model training ☆271 · Apr 7, 2024 · Updated last year
- [EMNLP 2022] Differentiable Data Augmentation for Contrastive Sentence Representation Learning. https://arxiv.org/abs/2210.16536 ☆40 · Nov 1, 2022 · Updated 3 years ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆67 · Dec 10, 2024 · Updated last year
- Github repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL" ☆33 · Nov 1, 2025 · Updated 4 months ago
- Training data extraction on GPT-2 ☆197 · Feb 4, 2023 · Updated 3 years ago