Gryphe / MergeMonster

An unsupervised model merging algorithm for Transformers-based language models.

☆106

Alternatives and similar repositories for MergeMonster

Users who are interested in MergeMonster are comparing it to the libraries listed below.

thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆175Updated last year
Digitous / ModelREVOLVER
Model REVOLVER, a human-in-the-loop model mixing system.
☆33Updated 2 years ago
Gryphe / BlockMerge_Gradient
Merge Transformers language models by use of gradient parameters.
☆206Updated 11 months ago
QuixiAI / laserRMT
This is our own implementation of 'Layer Selective Rank Reduction'
☆239Updated last year
eugenepentland / landmark-attention-qlora
Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA
☆123Updated 2 years ago
zarakiquemparte / zaraki-tools
☆27Updated last year
uukuguy / multi_loras
Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe…
☆156Updated last year
thooton / muse
Let's create synthetic textbooks together :)
☆75Updated last year
CoffeeVampir3 / ez-trainer
Train Llama Loras Easily
☆31Updated 2 years ago
euclaise / SlimTrainer
Full finetuning of large language models without large memory requirements
☆94Updated last year
VatsaDev / NanoPhi-alpha
GPT-2 small trained on phi-like data
☆67Updated last year
the-crypt-keeper / LLooM
Experimental LLM Inference UX to aid in creative writing
☆119Updated 7 months ago
jondurbin / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆78Updated last year
TheBlokeAI / AIScripts
Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub
☆162Updated last year
emrgnt-cmplxty / zero-shot-replication
☆74Updated last year
AlpinDale / sparsegpt-for-LLaMA
Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation.
☆71Updated 2 years ago
jukofyork / control-vectors
Generates control vectors for use with llama.cpp in GGUF format.
☆28Updated 4 months ago
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆140Updated 5 months ago
teknium1 / ShareGPT-Builder
☆115Updated 7 months ago
the-crypt-keeper / the-muse
Experimental sampler to make LLMs more creative
☆31Updated 2 years ago
QuixiAI / OpenChatML
☆157Updated last year
QuixiAI / grokadamw
☆134Updated 11 months ago
taprosoft / llm_finetuning
Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes…
☆146Updated last year
QuixiAI / kraken
☆66Updated last year
tdrussell / qlora-pipe
A pipeline parallel training script for LLMs.
☆153Updated 3 months ago
TehVenomm / LM_Transformers_BlockMerge
Image Diffusion block merging technique applied to transformers-based Language Models.
☆54Updated 2 years ago
EQ-bench / EQ-Bench
A benchmark for emotional intelligence in large language models
☆332Updated last year
rafacelente / bllama
1.58-bit LLaMa model
☆81Updated last year
sdan / selfextend
an implementation of Self-Extend, to expand the context window via grouped attention
☆119Updated last year
danikhan632 / guidance_api
An Extension for oobabooga/text-generation-webui
☆36Updated 2 years ago