transmuteAI / trailmet

Transmute AI Lab Model Efficiency Toolkit

☆19

Alternatives and similar repositories for trailmet

Users who are interested in trailmet are comparing it to the libraries listed below.

ariG23498 / quantized-diffusion-inference
Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs
☆37Updated 6 months ago
okarthikb / attention-visualizer
LLM attention pattern visualizer
☆10Updated last year
VijayLingam95 / SVFT
☆28Updated 3 months ago
facebookresearch / Mixture-of-Transformers
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025. 🔗 https://arxiv.org/abs/2411.049…
☆46Updated last week
graphcore-research / jax-scalify
JAX Scalify: end-to-end scaled arithmetics
☆16Updated 6 months ago
kyleliang919 / Online-Subspace-Descent
This repo is based on https://github.com/jiaweizzhao/GaLore
☆27Updated 8 months ago
IST-DASLab / RoSA
Official implementation of the ICML 2024 paper RoSA (Robust Adaptation)
☆41Updated last year
nyunAI / Faster-LLM-Survey
☆42Updated last year
EIFY / mup-vit
Everything you need to reproduce "Better plain ViT baselines for ImageNet-1k" in PyTorch, and more
☆9Updated this week
smonsays / hypernetwork-attention
Official code for the paper "Attention as a Hypernetwork"
☆33Updated 10 months ago
Qichuzyy / POA
Official implementation of ECCV24 paper: POA
☆24Updated 9 months ago
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆12Updated this week
facebookresearch / mexma
MEXMA: Token-level objectives improve sentence representations
☆41Updated 4 months ago
flowritecom / flow-merge
flow-merge is a powerful Python library that enables seamless merging of multiple transformer-based language models using the most popula…
☆17Updated 3 months ago
bentherien / mu_learned_optimization
[Oral; NeurIPS OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers
☆12Updated 2 months ago
JeanKaddour / LAWA
Latest Weight Averaging (NeurIPS HITY 2022)
☆30Updated last year
penfever / wildchat-50m
Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family.
☆29Updated last month
IST-DASLab / QuEST
Work in progress.
☆62Updated last month
zqOuO / GWT
☆13Updated 4 months ago
snu-mllab / LayerMerge
Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML 2024)
☆29Updated 9 months ago
jaisidhsingh / pytorch-mixtures
One-stop solutions for Mixture of Experts and Mixture of Depth modules in PyTorch.
☆22Updated 3 weeks ago
sayakpaul / big_vision_experiments
Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k.
☆22Updated 2 years ago
kyegomez / OpenStrawberry
An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO
☆29Updated this week
minyoungg / LTE
☆68Updated 10 months ago
IST-DASLab / MicroAdam
This repository contains code for the MicroAdam paper.
☆18Updated 5 months ago
The-Inscrutable-X / TACQ
Official Repository for Task-Circuit Quantization
☆20Updated 2 weeks ago
VITA-Group / WeLore
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,…
☆47Updated 3 weeks ago
wang-kee / LiNeS
Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"
☆26Updated 6 months ago
RobertCsordas / moe_layer
sigma-MoE layer
☆18Updated last year
SHI-Labs / CompactNet
☆31Updated last year