Lightweight Transformer for Multi-modal Tasks
☆16Dec 9, 2022Updated 3 years ago
Alternatives and similar repositories for LWTransformer
Users who are interested in LWTransformer are comparing it to the libraries listed below
- SOIT: Segmenting Objects with Instance-Aware Transformers☆14Jun 6, 2022Updated 3 years ago
- ☆19Jan 7, 2026Updated last month
- Official implementation for "SimA: Simple Softmax-free Attention for Vision Transformers"☆46Apr 18, 2024Updated last year
- ☆16Jul 20, 2022Updated 3 years ago
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`☆12Jun 22, 2022Updated 3 years ago
- This is an official implementation of our CVPR 2020 paper "Non-Local Neural Networks With Grouped Bilinear Attentional Transforms".☆12Jan 30, 2021Updated 5 years ago
- PyTorch implementation of Semantics-AssistedVideoCaptioning☆11Feb 16, 2023Updated 3 years ago
- Code for paper: Unified Text-to-Image Generation and Retrieval☆16Jul 6, 2024Updated last year
- LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021)☆18May 10, 2023Updated 2 years ago
- Code for "Searching for Efficient Multi-Stage Vision Transformers"☆63Sep 1, 2021Updated 4 years ago
- Optimized code based on M2 for faster image captioning training☆21Nov 18, 2022Updated 3 years ago
- MMPD Dataset from ECCV'2024 "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset"☆21Jul 15, 2024Updated last year
- ☆21Feb 3, 2025Updated last year
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023)☆40Feb 15, 2023Updated 3 years ago
- Scene Graph Generate Zero Shot☆23Apr 16, 2023Updated 2 years ago
- Implementation of the paper "Implicit Feature Refinement for Instance Segmentation".☆20Oct 27, 2021Updated 4 years ago
- ☆22Jun 30, 2023Updated 2 years ago
- Adaptive Split-Fusion Transformer (ICME 2023 Oral)☆17Feb 19, 2024Updated 2 years ago
- CVPR2022 - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation☆24Aug 12, 2022Updated 3 years ago
- Unofficial PyTorch implementation of "Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Ne…☆22Dec 19, 2019Updated 6 years ago
- ☆60Nov 3, 2022Updated 3 years ago
- video captioning☆24Mar 14, 2019Updated 6 years ago
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing☆33Dec 9, 2025Updated 2 months ago
- A PyTorch implementation of “X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance”☆29Jan 12, 2024Updated 2 years ago
- [ECCV 2022] AMixer: Adaptive Weight Mixing for Self-attention Free Vision Transformers☆29Nov 14, 2022Updated 3 years ago
- PyTorch implementation of "Dynamic Structure Pruning for Compressing CNNs" (AAAI 2023 Oral)☆27Jan 15, 2024Updated 2 years ago
- Official implementation for the paper "Referring Transformer: A One-step Approach to Multi-task Visual Grounding" (NeurIPS 2021)☆68May 26, 2022Updated 3 years ago
- ☆28Jan 8, 2023Updated 3 years ago
- [TNNLS'25] [MICCAI'24] A Parameter and Memory Efficient Transfer Learning Method☆34Oct 29, 2025Updated 4 months ago
- An educational step-by-step implementation of SimCLR that accompanies the blogpost☆31Mar 31, 2022Updated 3 years ago
- ☆34Jul 4, 2024Updated last year
- PyTorch implementation of Dynamic Grouping Convolution and Groupable ConvNet with pre-trained G-ResNeXt models☆69Jul 2, 2020Updated 5 years ago
- Caffe implementation for Active Shift Layer(ASL)☆33Mar 20, 2019Updated 6 years ago
- Deep Multimodal Neural Architecture Search☆29Nov 15, 2020Updated 5 years ago
- An attempt to reproduce SuperPoint☆33Aug 13, 2019Updated 6 years ago
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model…☆37Oct 18, 2023Updated 2 years ago
- PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation (ECCV 2022)☆34Jul 21, 2022Updated 3 years ago
- [CVPR2020] Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation (oral)☆139Aug 4, 2022Updated 3 years ago
- Using the TensorFlow Object Detection API and OpenCV to calculate real-world coordinates from a top view with a fixed camera height.☆10Jun 19, 2021Updated 4 years ago