Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"
☆77 · Apr 12, 2023 · Updated 3 years ago
Alternatives and similar repositories for long_tail_knowledge
Users that are interested in long_tail_knowledge are comparing it to the libraries listed below.
- Resolving Knowledge Conflicts in Large Language Models, COLM 2024 ☆18 · Oct 7, 2025 · Updated 8 months ago
- Code for "Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model", EMNLP Findings 20… ☆28 · Nov 2, 2023 · Updated 2 years ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆54 · Nov 21, 2022 · Updated 3 years ago
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu). ☆20 · May 14, 2022 · Updated 4 years ago
- Continual Memorization of Factoids in Large Language Models ☆12 · Nov 20, 2024 · Updated last year
- Code associated with the paper: "Few-Shot Self-Rationalization with Natural Language Prompts" ☆12 · Apr 27, 2022 · Updated 4 years ago
- Language models scale reliably with over-training and on downstream tasks ☆101 · Apr 2, 2024 · Updated 2 years ago
- ☆11 · Jul 15, 2020 · Updated 5 years ago
- ☆192 · Jul 2, 2025 · Updated last year
- AAAI 2022 Paper: Bet even Beth Harmon couldn't learn chess like that :) ☆39 · Mar 3, 2021 · Updated 5 years ago
- Code for paper 'Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse' ☆14 · Aug 2, 2024 · Updated last year
- [ICLR 2021] "InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective" by Boxin Wang, Shuohang Wang, Y… ☆86 · Oct 25, 2023 · Updated 2 years ago
- ☆17 · Aug 2, 2023 · Updated 2 years ago
- ☆12 · Jul 6, 2023 · Updated 2 years ago
- ☆15 · Jan 9, 2026 · Updated 5 months ago
- Code for the CRAC 2021 paper "On Generalization in Coreference Resolution" (Best short paper award) ☆35 · Jul 28, 2023 · Updated 2 years ago
- Code for SLT 2016 paper on Grapheme-to-Phoneme conversion using attention based encoder-decoder models ☆15 · Feb 20, 2019 · Updated 7 years ago
- Text generation with entities as context ☆30 · Jun 13, 2018 · Updated 8 years ago
- ☆57 · Apr 11, 2024 · Updated 2 years ago
- Pile Deduplication Code ☆18 · May 15, 2023 · Updated 3 years ago
- [ICCV 2023] Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting ☆15 · Nov 30, 2023 · Updated 2 years ago
- [EMNLP 2022] Code for our paper “ZeroGen: Efficient Zero-shot Learning via Dataset Generation”. ☆47 · Feb 18, 2022 · Updated 4 years ago
- ☆29 · Feb 17, 2024 · Updated 2 years ago
- NAACL 2022: Can Rationalization Improve Robustness? https://arxiv.org/abs/2204.11790 ☆27 · Nov 21, 2022 · Updated 3 years ago
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics". ☆16 · May 3, 2022 · Updated 4 years ago
- Utilities for PyTorch distributed ☆25 · Feb 27, 2025 · Updated last year
- Official codebase for permutation self-consistency. ☆19 · Feb 11, 2024 · Updated 2 years ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆45 · Oct 1, 2025 · Updated 9 months ago
- ☆13 · Jul 2, 2025 · Updated last year
- ☆29 · Feb 26, 2024 · Updated 2 years ago
- Test-time-training on nearest neighbors for large language models ☆50 · Apr 18, 2024 · Updated 2 years ago
- Poincaré Event Temporal Embeddings and Hyperbolic GRU for Event TempRel Extraction ☆11 · Nov 8, 2021 · Updated 4 years ago
- Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 20… ☆110 · Apr 18, 2022 · Updated 4 years ago
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji… ☆242 · Nov 3, 2023 · Updated 2 years ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆204 · Dec 8, 2025 · Updated 6 months ago
- ☆287 · Mar 2, 2024 · Updated 2 years ago
- Source code of "Calibrating Large Language Models Using Their Generations Only", ACL2024 ☆22 · Nov 20, 2024 · Updated last year
- DSIR large-scale data selection framework for language model training ☆274 · Apr 7, 2024 · Updated 2 years ago
- [EMNLP 2022] Differentiable Data Augmentation for Contrastive Sentence Representation Learning. https://arxiv.org/abs/2210.16536 ☆40 · Nov 1, 2022 · Updated 3 years ago