An unsupervised model merging algorithm for Transformers-based language models.
☆108Apr 29, 2024Updated last year
Alternatives and similar repositories for MergeMonster
Users who are interested in MergeMonster are comparing it to the libraries listed below
- Merge Transformers language models by use of gradient parameters.☆214Aug 8, 2024Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction'☆240May 26, 2024Updated last year
- EXL2 quantization generalized to other models.☆10Mar 17, 2024Updated last year
- Model REVOLVER, a human in the loop model mixing system.☆33Aug 2, 2023Updated 2 years ago
- Using fourier interpolation to merge large language models☆11Jan 6, 2026Updated last month
- Collection of high-quality roleplay conversations☆15Dec 1, 2024Updated last year
- Entropy Based Sampling and Parallel CoT Decoding☆17Oct 9, 2024Updated last year
- automatically quant GGUF models☆220Dec 23, 2025Updated 2 months ago
- Image Diffusion block merging technique applied to transformers based Language Models.☆56May 8, 2023Updated 2 years ago
- Large-scale LLM inference engine☆1,658Feb 17, 2026Updated last week
- Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models☆261Apr 23, 2024Updated last year
- QuIP quantization☆62Mar 17, 2024Updated last year
- Tools for merging pretrained large language models.☆6,814Jan 26, 2026Updated last month
- Aioli: A unified optimization framework for language model data mixing☆32Jan 17, 2025Updated last year
- Sakura-SOLAR-DPO: Merge, SFT, and DPO☆116Dec 30, 2023Updated 2 years ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆23Mar 12, 2024Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free☆233Oct 31, 2024Updated last year
- A bagel, with everything.☆326Apr 11, 2024Updated last year
- 8HP Eurorack module | Granular audio processor☆11Mar 23, 2021Updated 4 years ago
- A chat implementation for FastHTML☆11Sep 14, 2025Updated 5 months ago
- ☆11Dec 11, 2024Updated last year
- Let's raise an AI cat with its own dedicated memory!☆10Oct 25, 2024Updated last year
- notes, config, tools, etc. for kicking the tires on cockroachdb☆11Apr 8, 2025Updated 10 months ago
- Modified Beam Search with periodical restart☆12Sep 12, 2024Updated last year
- Spherical Merge Pytorch/HF format Language Models with minimal feature loss.☆144Sep 10, 2023Updated 2 years ago
- A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats.☆31May 29, 2023Updated 2 years ago
- An experiment to see if chatgpt can improve the output of the stanford alpaca dataset☆12Mar 29, 2023Updated 2 years ago
- Train Llama Loras Easily☆31Aug 3, 2023Updated 2 years ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆59Oct 18, 2025Updated 4 months ago
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated☆33Aug 14, 2024Updated last year
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs"☆155Oct 15, 2024Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite.☆34Mar 2, 2024Updated last year
- Control LLM generation format efficiently. A simple version of microsoft/aici in vllm and transformers☆14Jun 7, 2024Updated last year
- A quick and optimized solution to manage llama based gguf quantized models, download gguf files, retrieve message formatting, add more mo…☆12Jan 13, 2024Updated 2 years ago
- Automatically evaluate your LLMs in Google Colab☆686May 7, 2024Updated last year
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…☆252Oct 30, 2024Updated last year
- OpenAI GPT hosted Agent Framework for Windows and MacOS☆36Jul 8, 2024Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models☆180May 2, 2024Updated last year
- Code for the NeurIPS 2021 paper "Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks"☆14Sep 12, 2022Updated 3 years ago