pixeli99/MixLN

[ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxiang Li, Lu Yin, Shiwei Liu

☆30

Alternatives and similar repositories for MixLN

Users that are interested in MixLN are comparing it to the libraries listed below.

pixeli99 / OWS
Official Pytorch Implementation of "Outlier-weighed Layerwise Sampling for LLM Fine-tuning" by Pengxiang Li, Lu Yin, Xiaowei Gao, Shiwei …
☆35 · Jun 3, 2025 · Updated last year
zlab-princeton / llm-pruning-collection
A collection of various llm pruning implementations, training code for GPUs & TPUs, and evaluation script.
☆69 · Apr 20, 2026 · Updated 3 months ago
TianjinYellow / SPAM-Optimizer
☆36 · Mar 12, 2025 · Updated last year
CodeEval-Pro / CodeEval-Pro
[ACL'25 Findings] Official repo for "HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Task"
☆40 · Apr 7, 2025 · Updated last year
HelmholtzAI-FZJ / flex_gen
☆20 · Jan 10, 2025 · Updated last year
TianjinYellow / StableSPAM
☆28 · Jul 2, 2026 · Updated 3 weeks ago
stephenqz / OATS
Github Repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition
☆20 · Apr 16, 2025 · Updated last year
fmfi-compbio / admm-pruning
☆30 · Jul 22, 2024 · Updated 2 years ago
parsa-epfl / quantization-sparsity-interplay
This repo contains the code for studying the interplay between quantization and sparsity methods
☆26 · Feb 26, 2025 · Updated last year
wmn-231314 / diffusion-data-constraint
Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod…
☆127 · Jan 10, 2026 · Updated 6 months ago
Leey21 / CipherBank
☆14 · Jun 13, 2025 · Updated last year
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35 · Jun 12, 2024 · Updated 2 years ago
Lucky-Lance / SPP
[ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
☆22 · May 28, 2024 · Updated 2 years ago
clearloveclearlove / BEAT
☆15 · Feb 26, 2025 · Updated last year
Anonymous1252022 / fp4-all-the-way
☆52 · May 20, 2025 · Updated last year
CASE-Lab-UMD / LLM-Drop
The official implementation of the paper "Uncovering the Redundancy in Transformers via a Unified Study of Layer Dropping (TMLR)".
☆191 · Apr 23, 2026 · Updated 3 months ago
MikeWangWZHL / dymu
☆29 · May 13, 2025 · Updated last year
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆20 · Jul 15, 2026 · Updated last week
lmsdss / LayerNorm-Scaling
[NeurIPS 2025] Official Pytorch Implementation of "The Curse of Depth in Large Language Models" by Wenfang Sun, Xinyuan Song, Pengxiang L…
☆72 · Mar 3, 2026 · Updated 4 months ago
jylei16 / Imagine-e
☆14 · Jan 22, 2025 · Updated last year
Longin-Yu / ComRoPE
☆11 · Jun 11, 2025 · Updated last year
MaitySubhajit / KArAt
Kolmogorov-Arnold Attention: Is Learnable Attention Better for Vision Transformers?
☆16 · Jul 9, 2025 · Updated last year
thunlp / SparsingLaw
The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".
☆32 · Nov 12, 2024 · Updated last year
Guinan-Su / auto-merge-llm
An official repository for GPTailor
☆18 · Jun 29, 2025 · Updated last year
dibbla / Quantized-Evolution-Strategies
Quantized Evolution Strategies: High-precision Fine-tuning of Quantized LLMs at Low-precision Cost
☆21 · May 14, 2026 · Updated 2 months ago
DAMO-NLP-SG / LLM-Multilingual-Knowledge-Boundaries
[ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations
☆19 · Oct 18, 2025 · Updated 9 months ago
syjmelody / RankE
Implementation of RankE: End-to-End Discrete Text-to-Image Post-Training via Rank-Consistent Alignment
☆20 · May 27, 2026 · Updated last month
fastconvnets / cvpr2020
Code for "Fast Sparse ConvNets" CVPR2020 submissions
☆12 · Nov 20, 2019 · Updated 6 years ago
Twilight92z / Quantize-Watermark
☆19 · Nov 6, 2023 · Updated 2 years ago
PythonNut / superbpe
Official code release for "SuperBPE: Space Travel for Language Models"
☆97 · May 28, 2026 · Updated last month
dongwonjo / FastKV
[ACL Findings 2026] Official Implementation of "FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acc…
☆32 · Apr 14, 2026 · Updated 3 months ago
iLearn-Lab / ACL25-PTQ1.61
☆15 · Apr 6, 2026 · Updated 3 months ago
TianjinYellow / UGTs-LoG
This is the official code for UGTs.
☆13 · Feb 8, 2023 · Updated 3 years ago
RAIVNLab / MatFormer-OLMo
Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…
☆31 · Nov 14, 2023 · Updated 2 years ago
sanderland / script_tok
Code for the paper "BPE stays on SCRIPT", "Which Pieces Does Unigram Tokenization Really Need?" and MinGram
☆18 · Jun 26, 2026 · Updated 3 weeks ago
uiuctml / MergeBench
[NeurIPS 2025] MergeBench: A Benchmark for Merging Domain-Specialized LLMs
☆47 · Feb 11, 2026 · Updated 5 months ago
Purshow / Awesome-LVLM-Hallucination
☆56 · Nov 26, 2024 · Updated last year
rimads / avey-b
Code for the Avey-B paper (https://arxiv.org/abs/2602.15814)
☆32 · Feb 21, 2026 · Updated 5 months ago
jiwonsong-dev / SLEB
[ICML 2024] Official Implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
☆42 · Feb 4, 2025 · Updated last year