microsoft / TextNAS
This is the implementation of the TextNAS algorithm proposed in the paper "TextNAS: A Neural Architecture Search Space tailored for Text Representation".
☆15 Updated 2 years ago
Alternatives and similar repositories for TextNAS:
Users who are interested in TextNAS are comparing it to the libraries listed below:
- ☆199 Updated last year
- AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers ☆42 Updated 2 years ago
- RL algorithm: Advantage induced policy alignment ☆62 Updated last year
- [NeurIPS'23] Speculative Decoding with Big Little Decoder ☆88 Updated 11 months ago
- Renee: End-to-end training of extreme classification models ☆21 Updated last year
- AI Assistant for Building Reliable, High-performing and Fair Multilingual NLP Systems ☆45 Updated 2 years ago
- Research and development for optimizing transformers ☆125 Updated 3 years ago
- Factorized Neural Layers ☆27 Updated last year
- some common Huggingface transformers in maximal update parametrization (µP) ☆78 Updated 2 years ago
- [ICML'21 Oral] I-BERT: Integer-only BERT Quantization ☆234 Updated last year
- This package implements THOR: Transformer with Stochastic Experts. ☆61 Updated 3 years ago
- NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference ☆62 Updated last month
- Fast sparse deep learning on CPUs ☆51 Updated 2 years ago
- ☆91 Updated 7 months ago
- Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021) ☆100 Updated 4 years ago
- [KDD'22] Learned Token Pruning for Transformers ☆96 Updated last year
- pytorch-profiler ☆50 Updated last year
- An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. ☆19 Updated 2 years ago
- Generative Retrieval Transformer ☆28 Updated last year
- Lightweight Deep Learning Model Training library based on PyTorch ☆32 Updated 2 years ago
- ☆197 Updated 3 years ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆24 Updated last year
- ☆96 Updated 4 months ago
- ☆57 Updated 7 months ago
- Block Sparse movement pruning ☆78 Updated 4 years ago
- 🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch. ☆14 Updated this week
- Implementation of the article **Deep Learning CUDA Memory Usage and Pytorch optimization tricks** ☆43 Updated 5 years ago
- We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts… ☆93 Updated 5 months ago
- A tracing JIT for PyTorch ☆17 Updated 2 years ago
- Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval ☆15 Updated 2 years ago