Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"
☆78 · Apr 12, 2023 · Updated 3 years ago
Alternatives and similar repositories for long_tail_knowledge
Users that are interested in long_tail_knowledge are comparing it to the libraries listed below.
- Resolving Knowledge Conflicts in Large Language Models, COLM 2024 ☆18 · Oct 7, 2025 · Updated 6 months ago
- Code for "Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model", EMNLP Findings 20… ☆28 · Nov 2, 2023 · Updated 2 years ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆54 · Nov 21, 2022 · Updated 3 years ago
- ☆12 · Jun 5, 2024 · Updated last year
- Continual Memorization of Factoids in Large Language Models ☆12 · Nov 20, 2024 · Updated last year
- ☆11 · Jan 2, 2022 · Updated 4 years ago
- Language models scale reliably with over-training and on downstream tasks ☆101 · Apr 2, 2024 · Updated 2 years ago
- ☆11 · Jul 15, 2020 · Updated 5 years ago
- ☆187 · Jul 2, 2025 · Updated 9 months ago
- AAAI 2022 Paper: Bet even Beth Harmon couldn't learn chess like that :) ☆38 · Mar 3, 2021 · Updated 5 years ago
- ☆25 · Dec 12, 2025 · Updated 4 months ago
- [ICLR 2021] "InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective" by Boxin Wang, Shuohang Wang, Y… ☆85 · Oct 25, 2023 · Updated 2 years ago
- ☆12 · Jul 6, 2023 · Updated 2 years ago
- Code for the CRAC 2021 paper "On Generalization in Coreference Resolution" (Best short paper award) ☆36 · Jul 28, 2023 · Updated 2 years ago
- Text generation with entities as context ☆30 · Jun 13, 2018 · Updated 7 years ago
- ☆57 · Apr 11, 2024 · Updated 2 years ago
- Pile Deduplication Code ☆18 · May 15, 2023 · Updated 2 years ago
- ☆28 · Feb 17, 2024 · Updated 2 years ago
- NAACL 2022: Can Rationalization Improve Robustness? https://arxiv.org/abs/2204.11790 ☆27 · Nov 21, 2022 · Updated 3 years ago
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics". ☆16 · May 3, 2022 · Updated 3 years ago
- Official codebase for permutation self-consistency. ☆19 · Feb 11, 2024 · Updated 2 years ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆45 · Oct 1, 2025 · Updated 6 months ago
- ☆13 · Jul 2, 2025 · Updated 9 months ago
- ☆29 · Feb 26, 2024 · Updated 2 years ago
- Test-time-training on nearest neighbors for large language models ☆50 · Apr 18, 2024 · Updated last year
- Poincaré Event Temporal Embeddings and Hyperbolic GRU for Event TempRel Extraction ☆11 · Nov 8, 2021 · Updated 4 years ago
- Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 20… ☆110 · Apr 18, 2022 · Updated 3 years ago
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji… ☆244 · Nov 3, 2023 · Updated 2 years ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆201 · Dec 8, 2025 · Updated 4 months ago
- ☆283 · Mar 2, 2024 · Updated 2 years ago
- ☆17 · Dec 6, 2023 · Updated 2 years ago
- Implementation of our paper "Towards Consistent Document-Level Entity Linking: Joint Models for Entity Linking and Coreference Resolution… ☆12 · Nov 13, 2022 · Updated 3 years ago
- Source code of "Calibrating Large Language Models Using Their Generations Only", ACL 2024 ☆22 · Nov 20, 2024 · Updated last year
- DSIR large-scale data selection framework for language model training ☆272 · Apr 7, 2024 · Updated 2 years ago
- [EMNLP 2022] Differentiable Data Augmentation for Contrastive Sentence Representation Learning. https://arxiv.org/abs/2210.16536 ☆40 · Nov 1, 2022 · Updated 3 years ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆67 · Dec 10, 2024 · Updated last year
- Training data extraction on GPT-2 ☆195 · Feb 4, 2023 · Updated 3 years ago
- ☆33 · Nov 11, 2024 · Updated last year
- ☆16 · Mar 3, 2024 · Updated 2 years ago