A framework for a unified personalized model that achieves mutual enhancement between personalized understanding and generation. It demonstrates the potential of cross-task information transfer in personalized scenarios, paving the way for the development of general unified models.
☆128 Dec 25, 2025 Updated 2 months ago
Alternatives and similar repositories for UniCTokens
Users that are interested in UniCTokens are comparing it to the libraries listed below
- Official implementation of MC-LLaVA. ☆140 Nov 10, 2025 Updated 4 months ago
- Official code of MoSA (Mixture of Sparse Adapters). ☆13 Dec 14, 2023 Updated 2 years ago
- ☆33 Feb 15, 2026 Updated 3 weeks ago
- XJTU 2023 Fall OS Labs and Reports by Hypocrisy ☆12 Oct 16, 2024 Updated last year
- [ECCV2024] FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance ☆17 Sep 11, 2024 Updated last year
- VideoMathQA is a benchmark designed to evaluate mathematical reasoning in real-world educational videos ☆22 Jan 26, 2026 Updated last month
- [ICLR2025] Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want ☆95 Dec 1, 2025 Updated 3 months ago
- handy tools for user study ☆21 May 21, 2024 Updated last year
- Official repository for the UAE paper, unified-GRPO, and unified-Bench ☆158 Sep 12, 2025 Updated 5 months ago
- [TMC 2025/NOSSDAV 2023] Official code for RepCaM++ and RepCaM: Re-parameterization Content-aware Modulation for Neural Video Delivery ☆54 Apr 21, 2025 Updated 10 months ago
- [Paper] SoMoFormer: Multi-Person Pose Forecasting with Transformers ☆27 Mar 1, 2023 Updated 3 years ago
- Training Autoregressive Image Generation models via Reinforcement Learning ☆50 Nov 26, 2025 Updated 3 months ago
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning ☆35 Jul 15, 2025 Updated 7 months ago
- [AAAI 2024] Official code for Efficient Deweather Mixture-of-Experts with Uncertainty-aware Feature-wise Linear Modulation ☆62 Jan 26, 2025 Updated last year
- This is the official repository for the paper "Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing". ☆30 Mar 28, 2024 Updated last year
- A collection of awesome think with videos papers. ☆91 Dec 1, 2025 Updated 3 months ago
- ☆98 Jun 23, 2025 Updated 8 months ago
- [NeurIPS'24] Official implementation of paper "Unveiling the Tapestry of Consistency in Large Vision-Language Models". ☆38 Oct 23, 2024 Updated last year
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models ☆95 Sep 14, 2024 Updated last year
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation ☆27 Feb 28, 2026 Updated last week
- [MM 2024] Official code for VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness ☆52 Jul 24, 2024 Updated last year
- Code for the AAAI 2024 paper: "AGS: Affordable and Generalizable Substitute Training for Transferable Adversarial Attack" (accepted). ☆12 Mar 28, 2024 Updated last year
- ☆10 Oct 13, 2024 Updated last year
- code for LSN ☆10 Oct 28, 2024 Updated last year
- ☆12 Feb 7, 2018 Updated 8 years ago
- Debiasing Through Data Attribution ☆12 May 23, 2024 Updated last year
- ☆23 Dec 11, 2025 Updated 3 months ago
- [CVPR 2025] Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation ☆19 Dec 18, 2025 Updated 2 months ago
- [ICCV 2025] Official code for paper: Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs ☆70 Jul 1, 2025 Updated 8 months ago
- ☆89 Dec 12, 2025 Updated 2 months ago
- DataFlex is a data-centric training framework that enhances model performance by either selecting the most influential samples, optimizin… ☆113 Feb 4, 2026 Updated last month
- [NeurIPS 2024] Code for Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models ☆47 Mar 14, 2025 Updated 11 months ago
- [ECCV 2024] "REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models" ☆13 Aug 6, 2024 Updated last year
- ☆11 Sep 28, 2023 Updated 2 years ago
- The official PyTorch implementation of our IJCV-2025 paper "Learning with Enriched Inductive Biases for Vision-Language Models". ☆14 Mar 26, 2025 Updated 11 months ago
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos" ☆20 Dec 14, 2025 Updated 2 months ago
- Optimizable stack of images at different resolutions, a useful representation of images for deep learning tasks. Docs: https://johnowhita… ☆11 Sep 8, 2022 Updated 3 years ago
- ☆13 Apr 19, 2024 Updated last year
- Awesome latest models, datasets and benchmarks on streaming/online video understanding. ☆23 Oct 19, 2025 Updated 4 months ago