tommasomncttn / mergenetic
Flexible library for merging large language models (LLMs) via evolutionary optimization (ACL 2025 Demo).
☆46 Updated last month
Alternatives and similar repositories for mergenetic
Users who are interested in mergenetic are comparing it to the libraries listed below.
- Personal implementation of ASIF by Antonio Norelli ☆25 Updated last year
- A Python package for analyzing and transforming neural latent spaces. ☆46 Updated 6 months ago
- ☆34 Updated 5 months ago
- `dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms. ☆77 Updated 2 weeks ago
- ☆26 Updated 4 months ago
- nanoGPT-like codebase for LLM training ☆98 Updated last month
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆40 Updated 8 months ago
- ☆101 Updated 3 weeks ago
- ☆95 Updated 4 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆74 Updated 7 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆95 Updated 3 weeks ago
- Relative representations can be leveraged to enable solving tasks regarding "latent communication": from zero-shot model stitching to lat… ☆59 Updated 2 years ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆10 Updated 2 months ago
- A toolkit for quantitative evaluation of data attribution methods. ☆48 Updated this week
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆66 Updated 9 months ago
- PyTorch library for Active Fine-Tuning ☆80 Updated 4 months ago
- Attribution-based Parameter Decomposition ☆25 Updated 2 weeks ago
- ☆28 Updated last year
- Sparse Autoencoder Training Library ☆52 Updated last month
- ☆53 Updated 8 months ago
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity… ☆27 Updated last year
- Engine for collecting, uploading, and downloading model activations ☆18 Updated 2 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆75 Updated 6 months ago
- ☆53 Updated last year
- ☆37 Updated last year
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 Updated 2 years ago
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆69 Updated this week
- A library for efficient patching and automatic circuit discovery. ☆67 Updated 2 months ago
- ☆29 Updated 4 months ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆55 Updated 7 months ago