MSiam / PixFoundation

☆10

Alternatives and similar repositories for PixFoundation

Users who are interested in PixFoundation are comparing it to the libraries listed below.

tripletclip / TripletCLIP
[NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"
☆41 Updated 7 months ago
rui-qian / READ
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆38 Updated 2 months ago
tian1327 / SWAT
[CVPR 2025] Few-shot Recognition via Stage-Wise Retrieval-Augmented Finetuning
☆20 Updated 3 weeks ago
locuslab / llava-token-compression
☆42 Updated 8 months ago
wjpoom / SPEC
[CVPR 2024] The official implementation of the paper "Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding"
☆43 Updated last month
hammoudhasan / DiffCLIP
Official Implementation of DiffCLIP: Differential Attention Meets CLIP
☆36 Updated 4 months ago
mlvlab / RALF
Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection".
☆41 Updated 10 months ago
yuecao0119 / MMFuser
The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …
☆56 Updated 8 months ago
syp2ysy / prompt-SelF
[TIP] Exploring Effective Factors for Improving Visual In-Context Learning
☆19 Updated 2 weeks ago
alipay / POA
☆21 Updated 11 months ago
GasolSun36 / MVP
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
☆22 Updated 10 months ago
jonkahana / CLIPPR
An official PyTorch implementation for CLIPPR
☆29 Updated last year
LunarShen / DsicoVLA
[CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval
☆17 Updated 3 weeks ago
HashmatShadab / MambaRobustness
[CVPRW 2025] Official repository of paper titled "Towards Evaluating the Robustness of Visual State Space Models"
☆24 Updated last month
YBZh / OpenOOD-VLM
ECCV24, NeurIPS24, Benchmarking Generalized Out-of-Distribution Detection with Vision-Language Models
☆26 Updated 6 months ago
dogehhh / ReCLIP
PyTorch implementation of the CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation
☆47 Updated last month
iancovert / locality-alignment
☆50 Updated 6 months ago
ytaek-oh / vl_compo
☆10 Updated last year
FreedomIntelligence / TRIM
We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…
☆15 Updated 7 months ago
bardisafa / PreSel
[CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning"
☆12 Updated last month
si0wang / ViCrit
☆18 Updated 3 weeks ago
OpenGVLab / De-focus-Attention-Networks
Learning 1D Causal Visual Representation with De-focus Attention Networks
☆35 Updated last year
CVMI-Lab / clip-beyond-tail
(NeurIPS 2024) What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights
☆27 Updated 8 months ago
wangf3014 / Adventurer
☆22 Updated 4 months ago
MengLcool / SEGIC
[ECCV-24] This is the official implementation of the paper "SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation".
☆24 Updated 9 months ago
tydpan / OpenPartSeg
☆14 Updated 2 years ago
THU-MIG / VTC-CLS
Official repo for the paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs"
☆22 Updated 2 months ago
ethanlshen / HierNet
Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation…
☆21 Updated last year
ChangyaoTian / ADDP
The official implementation of ADDP (ICLR 2024)
☆12 Updated last year
sterzhang / PVIT
Official Repository of Personalized Visual Instruct Tuning
☆31 Updated 4 months ago