microsoft / TextNAS
This is the implementation of the TextNAS algorithm proposed in the paper TextNAS: A Neural Architecture Search Space tailored for Text Representation.
☆15 · Updated 3 years ago
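For orientation, the TextNAS paper builds its search space from a mix of layer types suited to text: 1-D convolutions of several kernel widths, recurrent layers, pooling, and multi-head self-attention. The snippet below is a minimal, hypothetical sketch of sampling one candidate network from such a space; the operation names and data structure are illustrative assumptions, not this repository's actual API.

```python
import random

# Candidate layer families from the TextNAS search space: 1-D convolutions,
# pooling, a recurrent layer, and multi-head self-attention. The identifiers
# here are illustrative assumptions, not names from the microsoft/TextNAS code.
CANDIDATE_OPS = [
    "conv1d_k1", "conv1d_k3", "conv1d_k5", "conv1d_k7",  # 1-D convolutions
    "avg_pool_k3", "max_pool_k3",                        # pooling
    "gru",                                               # recurrent layer
    "self_attention",                                    # multi-head attention
]

def sample_architecture(num_layers=24, seed=None):
    """Draw one candidate architecture: each layer picks an operation and
    an input from any earlier layer (-1 denotes the embedding layer)."""
    rng = random.Random(seed)
    layers = []
    for i in range(num_layers):
        layers.append({
            "op": rng.choice(CANDIDATE_OPS),
            "input": rng.randrange(i + 1) - 1,  # -1 = embedding layer
        })
    return layers

if __name__ == "__main__":
    for layer in sample_architecture(num_layers=4, seed=0):
        print(layer)
```

In the paper the sampling is driven by a reinforcement-learning controller with weight sharing rather than uniform random choice; the sketch only shows the shape of one sampled candidate.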
Alternatives and similar repositories for TextNAS
Users interested in TextNAS are comparing it to the libraries listed below
- Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021) ☆118 · Updated 3 years ago
- ☆252 · Updated last year
- Training material for IPU users: tutorials, feature examples, simple applications ☆87 · Updated 2 years ago
- Research and development for optimizing transformers ☆131 · Updated 4 years ago
- Block Sparse movement pruning ☆81 · Updated 5 years ago
- [ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing ☆336 · Updated last year
- ☆221 · Updated 2 years ago
- ☆16 · Updated 4 years ago
- OSLO: Open Source framework for Large-scale model Optimization ☆309 · Updated 3 years ago
- Renee: End-to-end training of extreme classification models ☆23 · Updated 2 years ago
- ☆78 · Updated last year
- DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. ☆171 · Updated 2 months ago
- Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021) ☆102 · Updated 5 years ago
- Implementation of the article **Deep Learning CUDA Memory Usage and Pytorch optimization tricks** ☆43 · Updated 5 years ago
- OSLO: Open Source for Large-scale Optimization ☆174 · Updated 2 years ago
- Scalable PaLM implementation in PyTorch ☆189 · Updated 2 years ago
- [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion ☆41 · Updated 4 years ago
- Blazing fast training of 🤗 Transformers on Graphcore IPUs ☆85 · Updated last year
- Torch Distributed Experimental ☆117 · Updated last year
- Factorized Neural Layers ☆31 · Updated 2 years ago
- Implementation of a Transformer, but completely in Triton ☆277 · Updated 3 years ago
- Pytorch library for factorized L0-based pruning. ☆45 · Updated 2 years ago
- The Triton backend for the PyTorch TorchScript models. ☆165 · Updated last week
- ☆87 · Updated 3 years ago
- Prune a model while finetuning or training. ☆404 · Updated 3 years ago
- ☆363 · Updated last year
- Simple implementation of Speculative Sampling in NumPy for GPT-2. ☆98 · Updated 2 years ago
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆189 · Updated 3 years ago
- ☆66 · Updated 3 years ago
- AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers ☆48 · Updated 3 years ago