ICCV2021 / Autoformer
☆16 Updated 3 years ago
Related projects
Alternatives and complementary repositories for Autoformer
- BM-NAS: Bilevel Multimodal Neural Architecture Search (AAAI 2022 Oral) ☆18 Updated last year
- Sparse Attention with Linear Units ☆17 Updated 3 years ago
- Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation. NeurIPS 2022. ☆29 Updated 2 years ago
- PyTorch implementation of RealFormer: Transformer Likes Residual Attention ☆11 Updated 3 years ago
- code for Explicit Sparse Transformer ☆56 Updated last year
- [ICLR 2024] Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks ☆28 Updated 9 months ago
- Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien… ☆55 Updated last week
- custom pytorch implementation of MoCo v3 ☆44 Updated 3 years ago
- Fire Together Wire Together: A Dynamic Pruning Approach with Self-Supervised Mask Prediction ☆10 Updated 2 years ago
- A repository for DenseSSMs ☆88 Updated 7 months ago
- [ACL 2023] Code for paper “Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation” (https://arxiv.org/abs/2305.… ☆37 Updated last year
- This is an official implementation of our CVPR 2020 paper "Non-Local Neural Networks With Grouped Bilinear Attentional Transforms". ☆12 Updated 3 years ago
- A simple program scheduler for your code on different devices. ☆11 Updated 3 months ago
- A Tight-fisted Optimizer ☆47 Updated last year
- The repo for reproducing the main results in TSMixer: An all-MLP Architecture for Time Series Forecasting. ☆10 Updated last year
- Auto^6ML is a jittor library allowing users to achieve machine learning automation. ☆24 Updated last month
- Implementation of the paper "Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting", https://arxi… ☆18 Updated 3 years ago
- [ECCV 2022] AMixer: Adaptive Weight Mixing for Self-attention Free Vision Transformers ☆28 Updated 2 years ago
- [ICLR 2022] "Unified Vision Transformer Compression" by Shixing Yu*, Tianlong Chen*, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Li… ☆48 Updated 11 months ago
- [CVPR '23] PA&DA: Jointly Sampling PAth and DAta for Consistent NAS ☆34 Updated last year
- Mixture of Attention Heads ☆39 Updated 2 years ago
- State Space Models ☆63 Updated 6 months ago
- Implementation of Denoising Diffusion Probabilistic Model in MindSpore ☆32 Updated last year
- Transformer model based on the Gated Attention Unit (preview version) ☆97 Updated last year