cscribano / DCT-Former-Public
Public repository for "DCT-Former: Efficient Self-Attention with Discrete Cosine Transform"
☆18 Updated last year
Alternatives and similar repositories for DCT-Former-Public:
Users who are interested in DCT-Former-Public are comparing it to the repositories listed below:
- ☆12 Updated last year
- ☆12 Updated last year
- [ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wen… ☆79 Updated last year
- Decoupled Kullback-Leibler Divergence Loss (DKL), NeurIPS 2024 ☆36 Updated last week
- ☆11 Updated last year
- ☆27 Updated 2 years ago
- Public repository for the ICLR'23 paper "Few-shot domain adaptation for end-to-end communication" ☆9 Updated last year
- Code for "Score-based Generative Modeling Secretly Minimizes the Wasserstein Distance", NeurIPS 2022. ☆16 Updated 2 years ago
- Denoising Masked Autoencoders Help Robust Classification. ☆60 Updated last year
- ☆54 Updated last year
- Deep Learning Model for Signal Data ☆85 Updated 5 years ago
- Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture" ☆69 Updated last year
- ICASSP 2024: Robust DOA estimation from deep acoustic imaging ☆12 Updated 10 months ago
- Test-time adaptation for speech recognition model by single utterance. The official implementation of "Listen, Adapt, Better WER: Source-… ☆17 Updated 2 years ago
- Transformer based Self-Attention for Complex Numbers ☆12 Updated 3 years ago
- [Oral; CVPR'22] Parametric Scattering Networks ☆24 Updated 10 months ago
- ☆33 Updated 4 years ago
- ☆12 Updated 4 years ago
- An official implementation of "Deep Joint Source-Channel Coding with Iterative Source Error Correction" ☆18 Updated last year
- ☆13 Updated 3 years ago
- Transformers w/o Attention, based fully on MLPs ☆93 Updated 10 months ago
- ☆11 Updated last year
- Unofficial implementation for the paper 'Improving Diffusion Models for Inverse Problems using Manifold Constraints'[https://arxiv.org/ab… ☆11 Updated 2 years ago
- ☆21 Updated 2 years ago
- Official codebase for our paper "Joslim: Joint Widths and Weights Optimization for Slimmable Neural Networks" ☆12 Updated 3 years ago
- Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization (IEEE TPAMI 2021) ☆17 Updated 3 years ago
- Sparse Attention with Linear Units ☆17 Updated 3 years ago
- ☆15 Updated last year
- Official project page for Estimating the Rate-Distortion Function by Wasserstein Gradient Descent ☆17 Updated last year
- ☆35 Updated 2 years ago