Code for "Merging Text Transformers from Different Initializations"
☆20Feb 2, 2025Updated last year
Alternatives and similar repositories for merging-text-transformers
Users that are interested in merging-text-transformers are comparing it to the libraries listed below.
- Benchmark API for Multidomain Language Modeling☆25Aug 26, 2022Updated 3 years ago
- A framework for merging models solving different tasks with different initializations into one multi-task model without any additional tr…☆313Jan 18, 2024Updated 2 years ago
- LLM-Merging: Building LLMs Efficiently through Merging☆210Sep 24, 2024Updated last year
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning☆30Jan 25, 2023Updated 3 years ago
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)☆92Jul 25, 2023Updated 2 years ago
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"☆19Jun 11, 2025Updated 9 months ago
- Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning☆33Jan 9, 2025Updated last year
- Official code for PLoP☆17Mar 6, 2026Updated 3 weeks ago
- Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations"☆12Oct 31, 2024Updated last year
- Using Fourier interpolation to merge large language models☆11Jan 6, 2026Updated 2 months ago
- Package to align tokens from different tokenizations.☆16Mar 25, 2024Updated 2 years ago
- Active Learning in the era of Foundation Models☆12Apr 16, 2025Updated 11 months ago
- [CVPR 2025] LoRA Recycle: Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs☆13Jun 20, 2025Updated 9 months ago
- Embedding Recycling for Language models☆38Jul 11, 2023Updated 2 years ago
- Manage ML configuration with pydantic☆16Mar 18, 2026Updated last week
- Code for "Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning" (EMNLP 2022) and "Empowering Parameter-Efficient Transfer Learning…☆11Feb 6, 2023Updated 3 years ago
- Official implementation of COLosSAL [MICCAI 2023]☆15Jul 22, 2023Updated 2 years ago
- ☆34Apr 14, 2025Updated 11 months ago
- Official implementation (PyTorch) of "Representation Shift: Unifying Token Compression with FlashAttention", ICCV 2025☆32Feb 22, 2026Updated last month
- Effective Attention Sheds Light On Interpretability - Findings of ACL2021☆11May 16, 2021Updated 4 years ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024]☆52Dec 22, 2025Updated 3 months ago
- ☆23Sep 19, 2024Updated last year
- ☆12Apr 3, 2024Updated last year
- A Streamlit app to add structured tags to a dataset card☆22Jun 30, 2022Updated 3 years ago
- ☆24Jun 7, 2021Updated 4 years ago
- Code for Tangent Model Composition for Ensembling and Continual Fine-tuning (ICCV 2023) and Tangent Transformers for Composition, Privacy…☆13May 14, 2024Updated last year
- ☆10Apr 16, 2024Updated last year
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25)☆14Jun 26, 2025Updated 9 months ago
- ReX - typesetting mathematics☆24Mar 10, 2026Updated 2 weeks ago
- decontamination☆27Mar 4, 2026Updated 3 weeks ago
- Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…☆13Jan 26, 2025Updated last year
- The official implementation of "Low-power, Continuous Remote Behavioral Localization with Event Cameras" (CVPR 2024)☆12Sep 25, 2024Updated last year
- JPEG encoding and decoding implemented from scratch (python JPEG codec)☆10Jul 29, 2022Updated 3 years ago
- ☆48Jan 21, 2024Updated 2 years ago
- ☆14Apr 27, 2022Updated 3 years ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging☆77Mar 1, 2025Updated last year
- Implementation of Bitune: Bidirectional Instruction-Tuning☆27Jun 19, 2025Updated 9 months ago
- Code for SaGe subword tokenizer (EACL 2023)☆28Nov 30, 2024Updated last year
- Code to reproduce the experiments of the ICLR24-paper: "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"☆12Oct 14, 2025Updated 5 months ago