tt-embedding / tt-embeddings
☆27 Updated 5 years ago
Related projects
Alternatives and complementary repositories for tt-embeddings
- ☆61 Updated 4 years ago
- Pytorch library for factorized L0-based pruning. ☆43 Updated last year
- Compression of NMT transformer model with tensor methods ☆48 Updated 5 years ago
- u-MPS implementation and experimentation code used in the paper Tensor Networks for Probabilistic Sequence Modeling (https://arxiv.org/ab… ☆19 Updated 4 years ago
- [ICLR 2022] Code for paper "Exploring Extreme Parameter Compression for Pre-trained Language Models" (https://arxiv.org/abs/2205.10036) ☆19 Updated last year
- ☆62 Updated 3 years ago
- A fully tensorized recurrent neural network using tensor-train decomposition ☆25 Updated last year
- An implementation of various tensor-based decomposition for NN & RNN parameters ☆18 Updated 6 years ago
- Structured matrices for compressing neural networks ☆67 Updated last year
- ☆11 Updated 2 years ago
- This package implements THOR: Transformer with Stochastic Experts. ☆61 Updated 3 years ago
- Block Sparse movement pruning ☆78 Updated 3 years ago
- Block-sparse primitives for PyTorch ☆148 Updated 3 years ago
- A custom PyTorch layer that is capable of implementing extremely wide and sparse linear layers efficiently ☆48 Updated 11 months ago
- ☆15 Updated 2 years ago
- [NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Ya… ☆138 Updated 2 years ago
- ☆32 Updated 3 years ago
- This pytorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022). ☆41 Updated 2 years ago
- Repository containing Pytorch code for EKFAC and K-FAC preconditioners. ☆140 Updated last year
- ICLR 2021 ☆44 Updated 3 years ago
- CUDA kernels for generalized matrix-multiplication in PyTorch ☆79 Updated 3 years ago
- STABILIZING GRADIENTS FOR DEEP NEURAL NETWORKS VIA EFFICIENT SVD PARAMETERIZATION ☆16 Updated 6 years ago
- Codes for "Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View" ☆145 Updated 5 years ago
- Code for the paper: "Tensor Programs II: Neural Tangent Kernel for Any Architecture" ☆97 Updated 4 years ago
- Metamodeling, sensitivity analysis and visualization using the tensor train format ☆23 Updated 2 years ago
- Distributed K-FAC Preconditioner for PyTorch ☆80 Updated this week
- MLPruning, PyTorch, NLP, BERT, Structured Pruning ☆21 Updated 3 years ago
- Research and development for optimizing transformers ☆125 Updated 3 years ago
- ☆10 Updated 4 years ago
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization". ☆32 Updated 3 years ago