Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention".
☆44Oct 29, 2021Updated 4 years ago
Alternatives and similar repositories for cosformer-pytorch
Users that are interested in cosformer-pytorch are comparing it to the libraries listed below.
- [ICLR 2022] Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention☆199Dec 2, 2022Updated 3 years ago
- ☆14May 3, 2022Updated 4 years ago
- A library of speech gadgets.☆14Oct 15, 2022Updated 3 years ago
- [TPAMI 2023] This is an official implementation for "Vicinity Vision Transformer".☆22Jun 15, 2023Updated 2 years ago
- ☆16Dec 23, 2021Updated 4 years ago
- ☆17Sep 15, 2024Updated last year
- Official implementation of "ExpoMamba: Exploiting Frequency SSM Blocks for Efficient and Effective Image Enhancement", Accepted in ICML E…☆25Oct 30, 2024Updated last year
- Question and answer retrieval in Turkish with BERT☆14Nov 30, 2021Updated 4 years ago
- Transformer based ASR Engine.☆13Aug 23, 2021Updated 4 years ago
- Korean Abstract Meaning Representation (AMR) Corpus☆10Feb 27, 2022Updated 4 years ago
- Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch☆76Dec 4, 2022Updated 3 years ago
- Codebase for "Channel selection using Gumbel Softmax"☆19Jan 20, 2021Updated 5 years ago
- Official Pytorch Implementation for the paper 'SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients'☆17Jan 12, 2022Updated 4 years ago
- Source code repo for paper "TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation"☆10Aug 11, 2023Updated 2 years ago
- [EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion from our EMNLP 2023 paper - Accelerating Toeplitz…☆14Oct 17, 2023Updated 2 years ago
- The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha…☆70Sep 19, 2021Updated 4 years ago
- Presents an optimized Apache Beam pipeline for generating sentence embeddings (runnable on Cloud Dataflow).☆20Mar 7, 2022Updated 4 years ago
- ☆10Jun 28, 2022Updated 3 years ago
- Leveraging Local and Global Patterns for Self-Attention Networks☆12Jun 3, 2019Updated 6 years ago
- [SIGGRAPH ASIA 2024] Frankenstein: Generating Semantic-Compositional 3D Scenes in One Tri-Plane☆20Nov 25, 2024Updated last year
- Cardiovascular disease dataset analysis for Data Science for Health (COSC 89.20)☆24May 10, 2019Updated 7 years ago
- Tacotron2 with BERT examples☆10Jul 8, 2019Updated 6 years ago
- ☆12Dec 11, 2020Updated 5 years ago
- Automatic gain control library☆15Jul 13, 2024Updated last year
- Weekly meetup, every Thursday at 20:00☆16Jul 24, 2020Updated 5 years ago
- Korean Visual Question Answering☆59Feb 18, 2020Updated 6 years ago
- ☆34Nov 30, 2023Updated 2 years ago
- Fully standalone builds of AEC, AGC, NS, and VAD from WebRTC☆22Jul 8, 2019Updated 6 years ago
- Serving files for hungry LLMs☆26Mar 9, 2026Updated 2 months ago
- Implementation of Multistream Transformers in Pytorch☆54Jul 31, 2021Updated 4 years ago
- scripts for working with open.bible data☆26Jan 24, 2022Updated 4 years ago
- [CVPR2022] "Progressive End-to-End Object Detection in Crowded Scenes" on Deformable-DETR.☆32Nov 4, 2022Updated 3 years ago
- Multi-modal data augmentation for machine learning☆16Jun 4, 2019Updated 6 years ago
- [TOG 2024] BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation☆16Jun 14, 2024Updated last year
- O'Reilly Course, In-Memory Computing Essentials☆10Oct 16, 2020Updated 5 years ago
- [ICCV 2021] Official implementation of "Scalable Vision Transformers with Hierarchical Pooling"☆32Dec 30, 2021Updated 4 years ago
- Adversarial Test Dataset for Korean Multi-turn Response Selection☆34Dec 16, 2021Updated 4 years ago
- ☆20Apr 17, 2023Updated 3 years ago
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".☆34Jun 11, 2025Updated 11 months ago