apd10 / universal_memory_allocation
☆15Updated 3 years ago
Alternatives and similar repositories for universal_memory_allocation
Users who are interested in universal_memory_allocation are comparing it to the libraries listed below.
- Differentiable Product Quantization for End-to-End Embedding Compression.☆64Updated 3 years ago
- High performance pytorch modules☆18Updated 2 years ago
- A Learnable LSH Framework for Efficient NN Training☆34Updated 4 years ago
- ☆14Updated 3 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012☆49Updated 3 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation.☆49Updated 4 years ago
- ☆15Updated 4 years ago
- ☆28Updated 6 years ago
- Implementation of vector quantization algorithms, codes for Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner P…☆59Updated 5 years ago
- Codes of the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform.☆13Updated 4 years ago
- AdamW optimizer for bfloat16 models in pytorch 🔥.☆39Updated last year
- [ NeurIPS '22 ] Data distillation for recommender systems. Shows equivalent performance with 2-3 orders less data.☆23Updated 2 years ago
- Large Scale Graphical Model☆24Updated 6 years ago
- Ancestral Gumbel-Top-k Sampling☆25Updated 5 years ago
- ☆32Updated 2 years ago
- This is a Tensor Train based compression library to compress sparse embedding tables used in large-scale machine learning models such as …☆194Updated 3 years ago
- A deep learning library based on Pytorch focussed on low resource language research and robustness☆70Updated 4 years ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights☆19Updated 3 years ago
- A study of the downstream instability of word embeddings☆12Updated 3 years ago
- AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks☆42Updated 8 years ago
- Extremely simple and fast extreme multi-class and multi-label classifiers.☆70Updated last month
- sigma-MoE layer☆20Updated 2 years ago
- Hyperparameter tuning via uncertainty modeling☆49Updated last year
- ☆29Updated 3 years ago
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021)☆63Updated 3 years ago
- Code for COMET: Cardinality Constrained Mixture of Experts with Trees and Local Search☆11Updated 2 years ago
- A collection of optimizers, some arcane others well known, for Flax.☆29Updated 4 years ago
- [EMNLP'19] Summary for Transformer Understanding☆53Updated 6 years ago
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols☆16Updated 4 years ago
- Code for paper 'Minimizing FLOPs to Learn Efficient Sparse Representations' published at ICLR 2020☆19Updated 5 years ago