xiaogang00 / MTFormer
This is the source code for the ECCV paper "MTFormer: Multi-Task Learning via Transformer and Cross-Task Reasoning"
☆200 Updated 3 years ago
Alternatives and similar repositories for MTFormer
Users who are interested in MTFormer are comparing it to the libraries listed below.
- Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems ☆120 Updated this week
- PySegMetrics (PSM): A Python-based Simple yet Efficient Evaluation Toolbox for Segmentation-like tasks ☆123 Updated last year
- ☆303 Updated 2 months ago
- (IJCV 2024 & ACM MM 2021 Oral) Multi-Source Fusion and Automatic Predictor Selection for Zero-Shot Video Object Segmentation ☆119 Updated 3 years ago
- [NeurIPS 2025] More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models ☆215 Updated 2 months ago
- [NeurIPS 2025] NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding ☆347 Updated 2 weeks ago
- ☆207 Updated 7 months ago
- Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dep… ☆575 Updated 4 months ago
- PyTorch implementation for "Unlearning the Noisy Correspondence Makes CLIP More Robust (ICCV 2025)" ☆68 Updated 3 months ago
- Official Pytorch implementation for ICML 2025 paper "Large Continual Instruction Assistant" ☆66 Updated last week
- The summary of code and paper for unified model towards context-dependent (CD) concept segmentation. ☆119 Updated 4 months ago
- (TIP 2022) Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction ☆109 Updated 9 months ago
- The Collapse of Patches ☆58 Updated last month
- (ECCV 2024) Open-Vocabulary Camouflaged Object Segmentation ☆268 Updated 4 months ago
- ☆247 Updated 11 months ago
- ☆67 Updated 4 months ago
- A curated collection of AI+X papers published in Nature / Science / Cell / Lancet / Radiology and their flagship sub-journals ☆136 Updated 3 months ago
- ☆204 Updated last week
- DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models ☆466 Updated 2 weeks ago
- DPO-Shift: Shifting the Distribution of Direct Preference Optimization ☆60 Updated 9 months ago
- [Nature Communications 2025] Towards Expert-level Autonomous Carotid Ultrasonography with Large-scale Learning-based Robotic System ☆277 Updated 2 months ago
- a multiscale multimodal large language models for radiology report generation (RRG) tasks ☆272 Updated this week
- (CVPR 2024 & arXiv 2025) Power Battery Detection ☆310 Updated 3 months ago
- Code for paper 'Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity… ☆92 Updated last year
- [AAAI 2026 Oral] Cook and Clean Together: Teaching Embodied Agents for Parallel Task Execution ☆356 Updated 3 weeks ago
- Official repository of MMGenBench ☆120 Updated 9 months ago
- ☆386 Updated 5 months ago
- Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs ☆164 Updated 9 months ago
- [NeurIPS 2025 (D&B)] Rethinking Evaluation of Infrared Small Target Detection ☆341 Updated 2 months ago
- (NeurIPS 2025) UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation ☆174 Updated last month