PKU-ICST-MIPL / Finedefics_ICLR2025
☆85Apr 21, 2025Updated 9 months ago
Alternatives and similar repositories for Finedefics_ICLR2025
Users who are interested in Finedefics_ICLR2025 are comparing it to the libraries listed below
- ☆11Jan 27, 2020Updated 6 years ago
- official repo for `thinking with images through-self-calling`☆20Dec 28, 2025Updated last month
- Official Repository of "SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery" (ECCV 2024)☆31Aug 4, 2025Updated 6 months ago
- FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens☆17Sep 8, 2025Updated 5 months ago
- ☆12Feb 2, 2023Updated 3 years ago
- (ECCV2024) Textual Knowledge Matters: Cross-Modality Co-Teaching for Generalized Visual Class Discovery (TextGCD)☆21Nov 26, 2025Updated 2 months ago
- Transactions on Multimedia (TMM25)☆19Apr 8, 2025Updated 10 months ago
- ☆15Oct 27, 2023Updated 2 years ago
- LMM solved catastrophic forgetting, AAAI2025☆45Apr 15, 2025Updated 10 months ago
- [EMNLP 2024] Official repository for paper "From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis"☆21Oct 15, 2024Updated last year
- A Simple Framework of Small-scale LMMs for Video Understanding☆108Jun 11, 2025Updated 8 months ago
- Official implementation of the paper "Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Rou…☆34Sep 25, 2025Updated 4 months ago
- code for paper: Simultaneous Image to Zero and Zero to Noise: Diffusion Models with Analytical Image Attenuation☆60Jan 17, 2026Updated last month
- 👆Pytorch implementation of "Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion"☆31Jul 28, 2025Updated 6 months ago
- [CVPR 2025] FLAIR: VLM with Fine-grained Language-informed Image Representations☆132Sep 1, 2025Updated 5 months ago
- ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs☆28Aug 15, 2025Updated 6 months ago
- ☆34Apr 9, 2025Updated 10 months ago
- LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning☆75May 23, 2025Updated 8 months ago
- Official implementation for "Diffusion Instruction Tuning"☆31Jun 10, 2025Updated 8 months ago
- FineCLIP: Self-distilled Region-based CLIP for Better Fine-grained Understanding (NIPS24)☆34Nov 12, 2025Updated 3 months ago
- A Data collector for self-driving using GTA5☆31Jul 24, 2017Updated 8 years ago
- Official implementation of TagAlign☆35Dec 11, 2024Updated last year
- ☆107Aug 14, 2025Updated 6 months ago
- Official PyTorch Implementation for the "RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling" paper!☆12Jun 10, 2025Updated 8 months ago
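- The Intellectual Property Management System (Open Source Edition) is built with PHP, ThinkPHP, and FastAdmin under the MIT license. It helps enterprises accumulate and manage their trademarks, patents, and copyrights, replacing the traditional Excel approach. The Intellectual Property Management System (Open Sourc…☆15Aug 23, 2024Updated last year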
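- This project is a secondary development based on the coze-studio project and follows its Apache 2.0 license. It mainly modifies and reuses the workflow portion of that code as the workflow module of the China Unicom Yuanjing Wanwu (联通元景万悟) agent platform.☆26Feb 10, 2026Updated last week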
- Official GitHub repo for Learning Normal Flow Directly from Event Neighborhoods (ICCV2025). It is an easy-to-use API for event-based norm…☆18Oct 5, 2025Updated 4 months ago
- CRUD components built on uni-app and uview, including the list component uvue-list, the form component uvue-form, and the form-item component uvue-form-item☆11Aug 6, 2024Updated last year
- Public teaching materials for Reasoning and Agents☆12May 29, 2025Updated 8 months ago
- ☆17May 25, 2025Updated 8 months ago
- CVPR2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning☆38Mar 21, 2025Updated 10 months ago
- Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'☆2,319Oct 29, 2025Updated 3 months ago
- [NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models☆75May 31, 2025Updated 8 months ago
- Bambo is a new proxy framework. Compared with mainstream frameworks, it is more lightweight and flexible and can handle various load task…☆33Feb 10, 2025Updated last year
- [CVPR 2025] T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation☆105Oct 25, 2025Updated 3 months ago
- [ICLR'24] Democratizing Fine-grained Visual Recognition with Large Language Models☆189Jul 15, 2024Updated last year
- SfMEdu System from Princeton for Dense 3D Reconstruction☆11Dec 11, 2019Updated 6 years ago
- CasTex: Cascaded Text-to-Texture Synthesis via Explicit Texture Maps and Physically-Based Shading☆33Jan 21, 2026Updated 3 weeks ago
- Codes and generated datasets for Paper "Multi-task deep learning for large-scale building detail extraction from high-resolution satellit…☆12Feb 19, 2024Updated last year