bloomberg / dataless-model-merging
Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)
☆84 Updated last year
Related projects
Alternatives and complementary repositories for dataless-model-merging
- ☆148 Updated 9 months ago
- Progressive Prompts: Continual Learning for Language Models ☆91 Updated last year
- ☆126 Updated 2 years ago
- ☆44 Updated 2 months ago
- Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models ☆133 Updated 2 years ago
- AI Logging for Interpretability and Explainability🔬 ☆87 Updated 5 months ago
- contrastive decoding ☆178 Updated last year
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆69 Updated 8 months ago
- SILO Language Models code repository ☆80 Updated 8 months ago
- ☆61 Updated 2 years ago
- Retrieval as Attention ☆82 Updated last year
- ☆78 Updated last year
- Function Vectors in Large Language Models (ICLR 2024) ☆116 Updated 3 weeks ago
- ☆50 Updated last year
- [ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners" ☆105 Updated last year
- Building modular LMs with parameter-efficient fine-tuning. ☆80 Updated this week
- ☆27 Updated last year
- Self-Alignment with Principle-Following Reward Models ☆148 Updated 8 months ago
- Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge" ☆73 Updated last year
- This is the implementation of the paper AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning (https://arxiv.org/abs/2205.1… ☆126 Updated last year
- This is the official repository for "Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts" (EMNLP 2022) ☆97 Updated last year
- ☆78 Updated last year
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆66 Updated 3 weeks ago
- ☆76 Updated last year
- Code accompanying the paper "Massive Activations in Large Language Models" ☆121 Updated 8 months ago
- The Paper List on Data Contamination for Large Language Models Evaluation. ☆73 Updated this week
- PASTA: Post-hoc Attention Steering for LLMs ☆107 Updated 2 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆84 Updated last year
- ☆33 Updated last year
- Inspecting and Editing Knowledge Representations in Language Models ☆107 Updated last year