ml-jku / EVALinks

One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation

☆41

Alternatives and similar repositories for EVA

Users that are interested in EVA are comparing it to the libraries listed below

Sorting:

epfml / schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
☆81Updated 9 months ago
RobertCsordas / moeut
☆83Updated 11 months ago
mcleish7 / gemstone-scaling-laws
☆27Updated 5 months ago
yidingjiang / ado
The repository contains code for Adaptive Data Optimization
☆25Updated 7 months ago
ahans30 / goldfish-loss
[NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs
☆91Updated 8 months ago
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆97Updated last year
shreyansh26 / Attention-Mask-Patterns
Using FlexAttention to compute attention with different masking patterns
☆44Updated 10 months ago
wang-kee / LiNeS
Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"
☆30Updated 9 months ago
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78Updated last year
epfml / DenseFormer
☆81Updated last year
minyoungg / LTE
☆68Updated last year
lucidrains / pause-transformer
Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…
☆54Updated last year
jonhue / activeft
PyTorch library for Active Fine-Tuning
☆87Updated 5 months ago
siyan-zhao / prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS …
☆61Updated 9 months ago
r-three / phatgoose
Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"
☆86Updated last year
katiekang1998 / reasoning_generalization
☆34Updated 6 months ago
JeanKaddour / LAWA
Latest Weight Averaging (NeurIPS HITY 2022)
☆31Updated 2 years ago
ltgoslo / bert-in-context
Official implementation of "BERTs are Generative In-Context Learners"
☆31Updated 4 months ago
amirzandieh / HyperAttention
Triton Implementation of HyperAttention Algorithm
☆48Updated last year
insuhan / hyper-attn
☆81Updated last year
ScalingIntelligence / large_language_monkeys
☆101Updated 10 months ago
kyleliang919 / Online-Subspace-Descent
[NeurIPS 2024] Low rank memory efficient optimizer without SVD
☆30Updated last month
MadryLab / DsDm
☆50Updated last year
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆102Updated 2 months ago
s-sahoo / Eso-LMs
Esoteric Language Models
☆89Updated last week
fal-ai-community / nano-mdm
Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun
☆55Updated 4 months ago
JeanKaddour / NoTrainNoGain
Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)
☆80Updated last year
lucidrains / CALM-pytorch
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind
☆177Updated 10 months ago
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆60Updated 11 months ago
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆110Updated 7 months ago