facebookresearch / SelfCite
Code for the ICML 2025 paper "SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models"
☆23Updated this week
Alternatives and similar repositories for SelfCite
Users who are interested in SelfCite are comparing it to the repositories listed below.
- The original Backpack Language Model implementation, a fork of FlashAttention☆71Updated 2 years ago
- DiffusER: Discrete Diffusion via Edit-based Reconstruction (Reid, Hellendoorn & Neubig, 2022)☆55Updated 5 months ago
- [NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective☆39Updated 2 years ago
- ☆45Updated last year
- Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models☆143Updated 3 years ago
- Code for EMNLP 2023 findings paper "A Closer Look into Using Large Language Models for Automatic Evaluation"☆19Updated 2 years ago
- Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control☆76Updated 3 years ago
- Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".☆80Updated last year
- Measuring the Mixing of Contextual Information in the Transformer☆34Updated 2 years ago
- contrastive decoding☆206Updated 3 years ago
- ☆19Updated 5 months ago
- Evaluate your agent memory on real-world dialogues, not LLM-simulated dialogues.☆36Updated 6 months ago
- ☆111Updated 2 years ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs☆81Updated 2 years ago
- NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings☆58Updated last year
- ☆98Updated 3 years ago
- Official Code Repository for LM-Steer Paper: "Word Embeddings Are Steers for Language Models" (ACL 2024 Outstanding Paper Award)☆135Updated 6 months ago
- ☆103Updated 2 years ago
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning".☆64Updated 5 months ago
- Official Implementation for the ICLR2023 paper "Fuzzy Alignments in Directed Acyclic Graph for Non-autoregressive Machine Translation"☆13Updated 2 years ago
- Tasks for describing differences between text distributions.☆17Updated last year
- Introducing Filtered Direct Preference Optimization (fDPO) that enhances language model alignment with human preferences by discarding lo…☆16Updated last year
- ☆68Updated 2 years ago
- ☆108Updated last year
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"☆84Updated 2 years ago
- Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"☆49Updated 7 months ago
- Teaching Models to Express Their Uncertainty in Words☆39Updated 3 years ago
- Code for Zero-Shot Tokenizer Transfer☆142Updated last year
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training☆23Updated last year
- ☆187Updated last year