microsoft / TextNAS

This is the implementation of the TextNAS algorithm proposed in the paper "TextNAS: A Neural Architecture Search Space Tailored for Text Representation".

☆15

Alternatives and similar repositories for TextNAS:

Users who are interested in TextNAS are comparing it to the libraries listed below.

HazyResearch / fly
☆199 Updated last year
microsoft / AutoMoE
AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers
☆42 Updated 2 years ago
microsoft / RLHF-APA
RL algorithm: Advantage induced policy alignment
☆62 Updated last year
kssteven418 / BigLittleDecoder
[NeurIPS'23] Speculative Decoding with Big Little Decoder
☆88 Updated 11 months ago
microsoft / renee
Renee: End-to-end training of extreme classification models
☆21 Updated last year
microsoft / Litmus
AI Assistant for Building Reliable, High-performing and Fair Multilingual NLP Systems
☆45 Updated 2 years ago
spcl / substation
Research and development for optimizing transformers
☆125 Updated 3 years ago
microsoft / fnl_paper
Factorized Neural Layers
☆27 Updated last year
microsoft / mutransformers
Some common Hugging Face transformers in maximal update parametrization (µP)
☆78 Updated 2 years ago
kssteven418 / I-BERT
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
☆234 Updated last year
microsoft / Stochastic-Mixture-of-Experts
This package implements THOR: Transformer with Stochastic Experts.
☆61 Updated 3 years ago
tanyuqian / redco
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
☆62 Updated last month
marsupialtail / sparsednn
Fast sparse deep learning on CPUs
☆51 Updated 2 years ago
huawei-noah / Efficient-NLP
☆91 Updated 7 months ago
clovaai / length-adaptive-transformer
Official PyTorch Implementation of Length-Adaptive Transformer (ACL 2021)
☆100 Updated 4 years ago
kssteven418 / LTP
[KDD'22] Learned Token Pruning for Transformers
☆96 Updated last year
cli99 / flops-profiler
pytorch-profiler
☆50 Updated last year
microsoft / deepspeed-gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
☆19 Updated 2 years ago
microsoft / GRTr
Generative Retrieval Transformer
☆28 Updated last year
microsoft / PyMarlin
Lightweight Deep Learning Model Training library based on PyTorch
☆32 Updated 2 years ago
Qualcomm-AI-research / transformer-quantization
☆197 Updated 3 years ago
microsoft / AMOS
[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
☆24 Updated last year
stanford-futuredata / stk
☆96 Updated 4 months ago
microsoft / DeepSpeed-Kernels
☆57 Updated 7 months ago
huggingface / block_movement_pruning
Block sparse movement pruning
☆78 Updated 4 years ago
pytorch-tpu / transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
☆14 Updated this week
quentinf00 / article-memory-log
Implementation of the article "Deep Learning CUDA Memory Usage and PyTorch optimization tricks"
☆43 Updated 5 years ago
microsoft / deep-language-networks
We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts…
☆93 Updated 5 months ago
microsoft / torchy
A tracing JIT for PyTorch
☆17 Updated 2 years ago
microsoft / BiDR
Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval
☆15 Updated 2 years ago