Code for "Merging Text Transformers from Different Initializations"
☆20Feb 2, 2025Updated last year
Alternatives and similar repositories for merging-text-transformers
Users that are interested in merging-text-transformers are comparing it to the libraries listed below.
- Benchmark API for Multidomain Language Modeling☆25Aug 26, 2022Updated 3 years ago
- A framework for merging models solving different tasks with different initializations into one multi-task model without any additional tr…☆313Jan 18, 2024Updated 2 years ago
- LLM-Merging: Building LLMs Efficiently through Merging☆209Sep 24, 2024Updated last year
- A curated list of Model Merging methods.☆95Dec 3, 2025Updated 4 months ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning☆30Jan 25, 2023Updated 3 years ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).☆34Mar 5, 2024Updated 2 years ago
- ☆77Apr 29, 2024Updated last year
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)☆92Jul 25, 2023Updated 2 years ago
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"☆19Jun 11, 2025Updated 10 months ago
- Official code for PLoP☆18Mar 6, 2026Updated last month
- Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?"☆33Jan 9, 2025Updated last year
- Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations"☆12Oct 31, 2024Updated last year
- ☆214Feb 3, 2024Updated 2 years ago
- Using Fourier interpolation to merge large language models☆11Jan 6, 2026Updated 3 months ago
- Package to align tokens from different tokenizations.☆16Mar 25, 2024Updated 2 years ago
- A Haskell roguelike toolkit☆12Jul 10, 2025Updated 9 months ago
- Active Learning in the era of Foundation Models☆12Apr 16, 2025Updated last year
- [CVPR 2025] LoRA Recycle: Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs☆14Jun 20, 2025Updated 9 months ago
- Embedding Recycling for Language models☆38Jul 11, 2023Updated 2 years ago
- Code for "Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning" (EMNLP 2022) and "Empowering Parameter-Efficient Transfer Learning…☆11Feb 6, 2023Updated 3 years ago
- A collection of papers on zero-shot/data-free knowledge distillation from 2019 to 2021☆11Sep 8, 2021Updated 4 years ago
- Official implementation of COLosSAL [MICCAI 2023]☆15Jul 22, 2023Updated 2 years ago
- ☆34Apr 14, 2025Updated last year
- Official implementation (PyTorch) of "Representation Shift: Unifying Token Compression with FlashAttention", ICCV 2025☆34Feb 22, 2026Updated last month
- ☆17Apr 11, 2024Updated 2 years ago
- Effective Attention Sheds Light On Interpretability - Findings of ACL2021☆11May 16, 2021Updated 4 years ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024]☆53Dec 22, 2025Updated 3 months ago
- A repository for organizing our submission to the MEDIQA-Chat Tasks @ ACL-ClinicalNLP 2023☆22Jul 21, 2023Updated 2 years ago
- ☆23Sep 19, 2024Updated last year
- ☆13Apr 3, 2024Updated 2 years ago
- A Streamlit app to add structured tags to a dataset card☆22Jun 30, 2022Updated 3 years ago
- [AAAI-25 Oral] Adaptive Calibration☆15Jul 6, 2025Updated 9 months ago
- Code for Tangent Model Composition for Ensembling and Continual Fine-tuning (ICCV 2023) and Tangent Transformers for Composition, Privacy…☆14May 14, 2024Updated last year
- Code for GFlowNet-DPO (Direct Preference Optimization) EMNLP 2024 Main☆19Feb 22, 2026Updated last month
- ☆10Apr 16, 2024Updated 2 years ago
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25)☆14Jun 26, 2025Updated 9 months ago
- ReX - typesetting mathematics☆24Mar 10, 2026Updated last month
- Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…☆13Jan 26, 2025Updated last year
- The official implementation of "Low-power, Continuous Remote Behavioral Localization with Event Cameras" (CVPR 2024)☆12Sep 25, 2024Updated last year