TomerRonen34 / mixed-resolution-vit
☆56Sep 28, 2023Updated 2 years ago
Alternatives and similar repositories for mixed-resolution-vit
Users interested in mixed-resolution-vit are comparing it to the repositories listed below.
- CVPR2023: Vector Quantization with Self-Attention for Quality-Independent Representation Learning.☆14May 17, 2024Updated last year
- Code for the paper "MULTI-BAND MASKING FOR WAVEFORM-BASED SINGING VOICE SEPARATION" that was accepted on EUSIPCO2022☆15Jun 18, 2022Updated 3 years ago
- A Spitting Image: Modular Superpixel Tokenization in Vision Transformers☆21Sep 12, 2025Updated 5 months ago
- ☆24Jun 13, 2022Updated 3 years ago
- Transformer eXplainability and eXploration☆20Oct 24, 2024Updated last year
- 🔊 A comprehensive list of open-source datasets for voice and sound computing (50+ datasets).☆20Apr 1, 2021Updated 4 years ago
- Bilingual Singing Voice Synthesis☆18Mar 25, 2024Updated last year
- Utilities for PyTorch distributed☆25Feb 27, 2025Updated 11 months ago
- Implementation of the ALI-G algorithm (PyTorch, Tensorflow)☆22Mar 7, 2021Updated 4 years ago
- Majesty Diffusion by @Dango233 and @apolinario (@multimodalart)☆25Jul 26, 2022Updated 3 years ago
- Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"☆26Mar 27, 2024Updated last year
- ☆58Oct 6, 2023Updated 2 years ago
- An official code release of the paper RGB no more: Minimally Decoded JPEG Vision Transformers☆57Jul 11, 2023Updated 2 years ago
- Keras implementation of musicnn, a set of pre-trained deep convolutional neural networks for music audio tagging☆27May 17, 2021Updated 4 years ago
- Official PyTorch implementation of our ECCV 2022 paper "Sliced Recursive Transformer"☆66Sep 6, 2022Updated 3 years ago
- [ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.☆27Oct 27, 2023Updated 2 years ago
- Experimental LDM uses of Paella's architecture☆34Jan 26, 2023Updated 3 years ago
- ☆30Jul 4, 2021Updated 4 years ago
- Superpixel Tokenization for Vision Transformers: Preserving Semantic Integrity in Visual Tokens☆44Mar 24, 2025Updated 10 months ago
- PyTorch reimplementation of FlexiViT: One Model for All Patch Sizes☆66May 5, 2024Updated last year
- 2nd place solution for the RSNA STR Pulmonary Embolism Detection competition on Kaggle.☆29Nov 29, 2020Updated 5 years ago
- Official implementation for Wavelet Feature Maps Compression for Image-to-Image CNNs, NeurIPS 2022.☆37Oct 12, 2022Updated 3 years ago
- Official implementation of GANmut (CVPR 2021)☆33Dec 28, 2022Updated 3 years ago
- This repository hosts code for converting the original Vision Transformer models (JAX) to TensorFlow.☆33Mar 23, 2022Updated 3 years ago
- Research code for "Training Vision-Language Transformers from Captions Alone"☆33Jul 15, 2022Updated 3 years ago
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models☆36Oct 3, 2024Updated last year
- SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems☆39Nov 1, 2023Updated 2 years ago
- (ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning☆36Aug 8, 2024Updated last year
- A Python implementation of the Hopfield network used to solve the traveling salesman problem☆10Apr 11, 2019Updated 6 years ago
- Code for the paper "Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription"☆40May 5, 2024Updated last year
- Face Verification Example with Flower / Federated Learning☆12Apr 3, 2023Updated 2 years ago
- ☆11Jun 22, 2025Updated 7 months ago
- Extract information from XBRL files in the ESEF format☆13Jan 3, 2026Updated last month
- Probing the representations of Vision Transformers.☆338Oct 5, 2022Updated 3 years ago
- [ICLR 2021 Spotlight Oral] "Undistillable: Making A Nasty Teacher That CANNOT teach students", Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Che…☆82Dec 30, 2021Updated 4 years ago
- Explorations into adversarial losses on top of autoregressive loss for language modeling☆41Dec 21, 2025Updated last month
- Scalable Diffusion Models with State Space Backbone☆157Mar 7, 2024Updated last year
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes☆57Oct 8, 2025Updated 4 months ago
- ☆10May 4, 2023Updated 2 years ago