uiuctml / MergeBench
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
☆17 · Updated 2 months ago
Alternatives and similar repositories for MergeBench
Users interested in MergeBench are comparing it to the repositories listed below.
- State-of-the-art paired encoder and decoder models (17M-1B params) ☆25 · Updated this week
- Official code release for "SuperBPE: Space Travel for Language Models" ☆61 · Updated this week
- Simple and scalable tools for data-driven pretraining data selection ☆24 · Updated last month
- Official code repo for the paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs" ☆23 · Updated 2 months ago
- Code for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity…" ☆27 · Updated last year
- https://footprints.baulab.info ☆17 · Updated 9 months ago
- Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation" ☆19 · Updated last year
- ☆38 · Updated last year
- ☆14 · Updated last month
- [ACL'24 Oral] Analysing the Impact of Sequence Composition on Language Model Pre-Training ☆22 · Updated 11 months ago
- Reference implementation for "Reward-Augmented Decoding: Efficient Controlled Text Generation with a Unidirectional Reward Model" ☆43 · Updated last year
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆58 · Updated last year
- ☆27 · Updated 5 months ago
- ☆35 · Updated 2 years ago
- Few-shot Learning with Auxiliary Data ☆28 · Updated last year
- ☆11 · Updated last year
- Official implementation of the ACL 2024 paper "Causal Estimation of Memorisation Profiles" ☆23 · Updated 3 months ago
- A Kernel-Based View of Language Model Fine-Tuning (https://arxiv.org/abs/2210.05643) ☆76 · Updated last year
- Landing page for MIB: A Mechanistic Interpretability Benchmark ☆16 · Updated last week
- Code for Zero-Shot Tokenizer Transfer ☆133 · Updated 6 months ago
- ☆51 · Updated last year
- ☆54 · Updated 2 years ago
- Language models scale reliably with over-training and on downstream tasks ☆97 · Updated last year
- Adding new tasks to T0 without catastrophic forgetting ☆33 · Updated 2 years ago
- ☆19 · Updated last year
- Repo for the ICML 2023 paper "Why do Nearest Neighbor Language Models Work?" ☆58 · Updated 2 years ago
- InstructIR, a novel benchmark designed to evaluate the instruction-following ability of information retrieval models. Our foc… ☆32 · Updated last year
- ☆45 · Updated last year
- Code for the NeurIPS 2024 Spotlight "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆77 · Updated 8 months ago
- Evaluation pipeline for the BabyLM Challenge 2023 ☆76 · Updated last year