ultmaster / utilsd
Common deep learning utils.
☆18 Updated last year
Alternatives and similar repositories for utilsd
Users who are interested in utilsd are comparing it to the libraries listed below.
- Code for RIM ☆22 Updated 2 years ago
- This package implements THOR: Transformer with Stochastic Experts. ☆62 Updated 3 years ago
- PyTorch library for factorized L0-based pruning. ☆45 Updated last year
- Deep learning images developed from nvidia/cuda-cudnn-devel-ubuntu. ☆23 Updated 2 years ago
- Differentiable Product Quantization for End-to-End Embedding Compression. ☆62 Updated 2 years ago
- ☆84 Updated 4 years ago
- Efficient Neural Interaction Functions Search for Collaborative Filtering ☆18 Updated 5 years ago
- Source code for our AAAI'22 paper "From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression" ☆24 Updated 3 years ago
- [ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, … ☆18 Updated 3 years ago
- ☆33 Updated 3 years ago
- ☆18 Updated 5 years ago
- This repo is to demo the concept of lossless compression with Transformers as encoder and decoder. ☆14 Updated last year
- ☆16 Updated 3 years ago
- Codes for Causal Semantic Generative model (CSG), the model proposed in "Learning Causal Semantic Representation for Out-of-Distribution … ☆73 Updated 3 years ago
- ☆20 Updated 5 years ago
- Crawl & visualize ICLR papers and reviews. ☆18 Updated 2 years ago
- Code for the ACL-2022 paper "StableMoE: Stable Routing Strategy for Mixture of Experts" ☆45 Updated 2 years ago
- ☆12 Updated 2 years ago
- Code for COMET: Cardinality Constrained Mixture of Experts with Trees and Local Search ☆10 Updated last year
- MetaBalance algorithm for multi-task learning ☆58 Updated 3 years ago
- Implementation of a Quantized Transformer Model ☆19 Updated 6 years ago
- [NeurIPS 2022] DreamShard: Generalizable Embedding Table Placement for Recommender Systems ☆29 Updated 2 years ago
- Distributed DataLoader For PyTorch Based On Ray ☆24 Updated 3 years ago
- Official PyTorch Implementation of Deep Networks from the Principle of Rate Reduction (2021) ☆37 Updated 4 years ago
- ☆81 Updated 9 months ago
- Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization ☆16 Updated 6 years ago
- Code for the PAPA paper ☆27 Updated 2 years ago
- [KDD 2022] AutoShard: Automated Embedding Table Sharding for Recommender Systems ☆22 Updated 2 years ago
- Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation", Haoxi… ☆68 Updated 3 years ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆76 Updated last year