BorealisAI / neuzip
Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This repository contains the code for the experiments in the paper.
☆ 58 · Updated 5 months ago
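As a rough illustration of why lossless compression of network weights can pay off (the NeuZip paper builds on the observation that the exponent bits of floating-point weights carry little entropy), here is a minimal sketch — not the repository's code. It assumes Gaussian-distributed weights, which is only an approximation of trained networks, and measures the empirical entropy of the float32 exponent field:

```python
import math
import random
import struct
from collections import Counter

# Sketch (not NeuZip's implementation): weights cluster around zero, so
# the 8-bit exponent field of a float32 weight takes few distinct values
# and its empirical entropy is well below 8 bits — the headroom that a
# lossless entropy coder can exploit.

random.seed(0)
# Assumption: Gaussian weights with a small std, as a stand-in for a
# trained layer's weight distribution.
weights = [random.gauss(0.0, 0.02) for _ in range(100_000)]

def exponent_bits(x: float) -> int:
    """Return the 8 biased-exponent bits of x as an IEEE 754 float32."""
    (raw,) = struct.unpack("<I", struct.pack("<f", x))
    return (raw >> 23) & 0xFF

counts = Counter(exponent_bits(w) for w in weights)
n = sum(counts.values())
entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

print(f"distinct exponent values: {len(counts)}")
print(f"empirical exponent entropy: {entropy:.2f} bits (of 8)")
```

On Gaussian weights the entropy comes out to only a few bits, which is the kind of redundancy a scheme like this targets; the sign and mantissa bits, by contrast, are close to incompressible.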
Alternatives and similar repositories for neuzip:
Users who are interested in neuzip are comparing it to the repositories listed below.
- Work in progress. ☆ 56 · Updated 2 weeks ago
- FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation ☆ 48 · Updated 9 months ago
- Training-free, post-training attention with sub-quadratic complexity, implemented with OpenAI Triton. ☆ 128 · Updated this week
- RWKV-7: Surpassing GPT ☆ 83 · Updated 5 months ago
- A repository for research on medium-sized language models. ☆ 76 · Updated 10 months ago
- A single repo with all the scripts and utils needed to train or fine-tune the Mamba model, with or without FIM. ☆ 54 · Updated last year
- ☆ 122 · Updated 3 weeks ago
- ☆ 46 · Updated 9 months ago
- Code for the MicroAdam paper. ☆ 18 · Updated 4 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention". ☆ 97 · Updated 6 months ago
- QuIP quantization. ☆ 51 · Updated last year
- Code and materials for speeding up LLM inference with token merging. ☆ 36 · Updated 11 months ago
- Implementation of Mind Evolution ("Evolving Deeper LLM Thinking") from DeepMind. ☆ 47 · Updated 2 months ago
- Code accompanying the paper "Generalized Interpolating Discrete Diffusion". ☆ 74 · Updated last month
- ☆ 50 · Updated 5 months ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆ 44 · Updated last month
- ☆ 52 · Updated last month
- Train, tune, and run inference with the Bamba model. ☆ 88 · Updated 3 months ago
- ☆ 79 · Updated 5 months ago
- PB-LLM: Partially Binarized Large Language Models. ☆ 151 · Updated last year
- An extension of the GaLore paper that performs natural gradient descent in a low-rank subspace. ☆ 15 · Updated 6 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024). ☆ 152 · Updated last week
- PyTorch implementation of models from the Zamba2 series. ☆ 179 · Updated 2 months ago
- A toolkit for fine-tuning, inference, and evaluation of GreenBitAI's LLMs. ☆ 82 · Updated last month
- Tiny re-implementation of MDM in the style of LLaDA and the nano-gpt speedrun. ☆ 47 · Updated last month
- This repo is based on https://github.com/jiaweizzhao/GaLore. ☆ 26 · Updated 7 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆ 197 · Updated 9 months ago
- A list of language models with permissive licenses such as MIT or Apache 2.0. ☆ 24 · Updated last month
- Layer-condensed KV cache with a 10× larger batch size, fewer parameters, and less computation. Dramatic speed-up with better task performance… ☆ 148 · Updated 2 weeks ago
- ☆ 76 · Updated 3 months ago