transmuteAI / trailmet
Transmute AI Lab Model Efficiency Toolkit
☆19 Updated last year
Alternatives and similar repositories for trailmet
Users who are interested in trailmet are comparing it to the libraries listed below.
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs ☆37 Updated 6 months ago
- LLM attention pattern visualizer ☆10 Updated last year
- ☆28 Updated 3 months ago
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025. 🔗 https://arxiv.org/abs/2411.049… ☆46 Updated last week
- JAX Scalify: end-to-end scaled arithmetics ☆16 Updated 6 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆27 Updated 8 months ago
- Official implementation of the ICML 2024 paper RoSA (Robust Adaptation) ☆41 Updated last year
- ☆42 Updated last year
- Everything you need to reproduce "Better plain ViT baselines for ImageNet-1k" in PyTorch, and more ☆9 Updated this week
- Official code for the paper "Attention as a Hypernetwork" ☆33 Updated 10 months ago
- Official implementation of ECCV24 paper: POA ☆24 Updated 9 months ago
- Fork of Flame repo for training of some new stuff in development ☆12 Updated this week
- MEXMA: Token-level objectives improve sentence representations ☆41 Updated 4 months ago
- flow-merge is a powerful Python library that enables seamless merging of multiple transformer-based language models using the most popula… ☆17 Updated 3 months ago
- [Oral; NeurIPS OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers ☆12 Updated 2 months ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆30 Updated last year
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family. ☆29 Updated last month
- Work in progress. ☆62 Updated last month
- ☆13 Updated 4 months ago
- Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML 2024) ☆29 Updated 9 months ago
- One-stop solutions for Mixture of Experts and Mixture of Depth modules in PyTorch. ☆22 Updated 3 weeks ago
- Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k. ☆22 Updated 2 years ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆29 Updated this week
- ☆68 Updated 10 months ago
- This repository contains code for the MicroAdam paper. ☆18 Updated 5 months ago
- Official Repository for Task-Circuit Quantization ☆20 Updated 2 weeks ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆47 Updated 3 weeks ago
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆26 Updated 6 months ago
- sigma-MoE layer ☆18 Updated last year
- ☆31 Updated last year