transmuteAI / trailmet
Transmute AI Lab Model Efficiency Toolkit
☆19 Updated last year
Alternatives and similar repositories for trailmet:
Users who are interested in trailmet are comparing it to the libraries listed below:
- ☆42 Updated last year
- Collection of autoregressive model implementations ☆85 Updated this week
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction. ☆36 Updated 3 weeks ago
- E2E AutoML Model Compression Package ☆47 Updated last month
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆37 Updated last year
- Official code for the paper "Attention as a Hypernetwork" ☆30 Updated 10 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs ☆38 Updated 6 months ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆30 Updated last year
- JAX Scalify: end-to-end scaled arithmetics ☆16 Updated 5 months ago
- High-performance, asynchronous Python HTTP client library designed for faster file transfers using concurrency, semaphores, and fault-tol… ☆55 Updated 2 weeks ago
- Official implementation of ECCV24 paper: POA ☆24 Updated 8 months ago
- Work in progress. ☆56 Updated 2 weeks ago
- Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k. ☆22 Updated 2 years ago
- This repository contains code for the MicroAdam paper. ☆18 Updated 4 months ago
- ☆43 Updated last year
- Pokedex for LLMs ☆11 Updated 2 weeks ago
- LLM attention pattern visualizer ☆10 Updated last year
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆45 Updated last year
- Official Repository for Task-Circuit Quantization ☆15 Updated 2 weeks ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆17 Updated last month
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025] ☆19 Updated last month
- A collection of various LLM sampling methods implemented in pure PyTorch ☆23 Updated 4 months ago
- flow-merge is a powerful Python library that enables seamless merging of multiple transformer-based language models using the most popula… ☆17 Updated 2 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆44 Updated last week
- ☆22 Updated last year
- Minimal pretraining script for language modeling in PyTorch. Supporting torch compilation and DDP. It includes a model implementation and… ☆12 Updated last week
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆26 Updated 7 months ago
- Utilities for Training Very Large Models ☆58 Updated 7 months ago
- ☆27 Updated 2 months ago