aassxun / Understanding-Vision-Tasks
☆206 Updated 5 months ago
Alternatives and similar repositories for Understanding-Vision-Tasks
Users who are interested in Understanding-Vision-Tasks are comparing it to the repositories listed below
- Official Pytorch implementation for ICML 2025 paper "Large Continual Instruction Assistant" ☆64 Updated 3 months ago
- A pytorch implementation of the paper "TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Simi… ☆343 Updated last month
- ☆186 Updated 2 weeks ago
- ☆197 Updated last month
- The code for TPAMI paper "Text-Guided Human Image Manipulation via Image-Text Shared Space" ☆86 Updated 3 years ago
- Dataset and evaluation code of ISDrama (ACM-MM 2025): Immersive Spatial Drama Generation through Multimodal Prompting ☆234 Updated 2 months ago
- DPO-Shift: Shifting the Distribution of Direct Preference Optimization ☆60 Updated 8 months ago
- ☆67 Updated 3 months ago
- [CVPR 2025 Highlight] Official Implementation of SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity ☆112 Updated 5 months ago
- [NeurIPS 2025] More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models ☆202 Updated 2 weeks ago
- This is a pytorch project for the paper Universal Adaptive Data Augmentation (IJCAI2023). ☆86 Updated 3 months ago
- This is the pytorch implementation for AAAI2022 paper "Hierarchical Image Generation via Transformer-Based Sequential Patch Selection" ☆84 Updated 3 years ago
- (TIP 2022) Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction ☆109 Updated 7 months ago
- Main Project of AIDE ☆91 Updated 9 months ago
- The project for General Adversarial Defense Against Black-box Attacks via Pixel Level and Feature Level Distribution Alignments. ☆81 Updated 2 years ago
- ☆50 Updated 7 months ago
- Repository for the paper: ☆69 Updated last year
- ☆233 Updated 5 months ago
- ☆121 Updated 4 months ago
- [ACL 2025] FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation ☆58 Updated 5 months ago
- A multiscale multimodal large language model for radiology report generation (RRG) tasks ☆267 Updated 3 months ago
- ☆393 Updated 6 months ago
- ☆104 Updated last month
- ☆42 Updated 9 months ago
- Using KAG and RAG Approaches to Enhance an AI-Powered Cryptocurrency Trading Agent ☆28 Updated 9 months ago
- Bingo is a desktop application designed specifically for ad developers. It helps you quickly build, test, and publish cross-platform play… ☆83 Updated 2 months ago
- ☆80 Updated 3 months ago
- An Integrated Library for Tuning, Deploying and Interpreting Genomic Models ☆118 Updated last month
- [MM 2024] Official code for VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness ☆52 Updated last year
- An interactive React 18 portfolio featuring AI-powered career assistance, dynamic project showcases with live previews, smooth Framer Mot… ☆89 Updated 2 months ago