shuheikurita/RefEgo

☆13

Alternatives and similar repositories for RefEgo

Users who are interested in RefEgo are comparing it to the libraries listed below.

amitakamath / vl_text_encoders_are_bottlenecks
Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!
☆11 · May 24, 2023 · Updated 3 years ago
SMILE-data / SMILE
SMILE: A Multimodal Dataset for Understanding Laughter
☆13 · Jun 15, 2023 · Updated 3 years ago
lxa9867 / QSD
[CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"
☆12 · Feb 27, 2024 · Updated 2 years ago
qumengxue / RIO
☆13 · Oct 30, 2023 · Updated 2 years ago
florianHofherr / PhysParamInference
☆19 · Jan 30, 2023 · Updated 3 years ago
lezhang7 / Enhance-FineGrained
[CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding
☆56 · Apr 7, 2025 · Updated last year
ilkerkesen / ViLMA
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)
☆16 · Jan 18, 2024 · Updated 2 years ago
lscpku / VITATECS
☆18 · Jul 10, 2024 · Updated 2 years ago
danielchyeh / this-is-my
Official This-Is-My Dataset published in CVPR 2023
☆16 · Jul 18, 2024 · Updated 2 years ago
ChangyaoTian / ADDP
The official implementation of ADDP (ICLR 2024)
☆12 · Mar 27, 2024 · Updated 2 years ago
pipinstallyp / minigpt4-batch
Use miniGPT-4 batch to generate captions for a lot of images! You should be able to create the best captions you always wanted!
☆18 · Jul 20, 2023 · Updated 3 years ago
ethanlshen / HierNet
Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation…
☆23 · Nov 8, 2023 · Updated 2 years ago
wtjiang98 / PoseTrans
code of "PoseTrans: A Simple Yet Effective Pose Transformation Augmentation for Human Pose Estimation" (ECCV 2022)
☆26 · Feb 20, 2023 · Updated 3 years ago
bo-miao / HTR
[TCSVT 2024] Temporally Consistent Referring Video Object Segmentation with Hybrid Memory
☆19 · Apr 9, 2025 · Updated last year
tomchen-ctj / OST
【CVPR'24】OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
☆39 · Apr 27, 2024 · Updated 2 years ago
tandav / pitch-detectors
collection of pitch (f0, fundamental frequency) detection algorithms with unified interface
☆25 · Nov 25, 2024 · Updated last year
aszala / VPEval
VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)
☆45 · Nov 29, 2023 · Updated 2 years ago
jmiemirza / ActMAD
ActMAD: Activation Matching to Align Distributions for Test-Time-Training (CVPR 2023)
☆21 · Jun 27, 2023 · Updated 3 years ago
arijitray1993 / COLA
COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!
☆25 · May 14, 2026 · Updated 2 months ago
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆27 · Jan 6, 2024 · Updated 2 years ago
asudahkzj / Wnet
Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks
☆24 · Sep 6, 2022 · Updated 3 years ago
HengLan / VastTrack
[NeurIPS 2024] VastTrack: Vast Category Visual Object Tracking
☆77 · Sep 30, 2025 · Updated 9 months ago
weijiawu / Awesome-Synthetic-Data-for-Perception-Task
☆45 · Jun 1, 2023 · Updated 3 years ago
GeWu-Lab / APPO
The official repository for CVPR'26 Paper "APPO: Attention-guided Perception Policy Optimization for Video Reasoning"
☆17 · Mar 19, 2026 · Updated 4 months ago
wanglu-cs / Think_While_Watching
☆19 · Jun 26, 2026 · Updated last month
SAP / software-documentation-data-set-for-machine-translation
A parallel evaluation data set of SAP software documentation with document structure annotation
☆15 · Jun 12, 2026 · Updated last month
KarlesZheng / FERMT
☆13 · Jul 15, 2024 · Updated 2 years ago
MCG-NJU / CaReBench
A Fine-grained Benchmark for Video Captioning and Retrieval
☆30 · Jul 16, 2025 · Updated last year
AngelosNal / Vision-DiffMask
Official PyTorch implementation of Vision DiffMask, a post-hoc interpretation method for vision models.
☆32 · Mar 5, 2024 · Updated 2 years ago
thunlp / Migician
[ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models
☆90 · May 20, 2025 · Updated last year
yoxu515 / MITS
☆21 · Jul 25, 2024 · Updated 2 years ago
Chuhanxx / helping_hand_for_egocentric_videos
Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model'
☆33 · Nov 7, 2023 · Updated 2 years ago
jiahaolu97 / anything-unsegmentable
(CVPR 2024) "Unsegment Anything by Simulating Deformation"
☆29 · May 27, 2024 · Updated 2 years ago
JackWhite-rwx / SceneGraphGenZeroShotWithGSAM
Scene Graph Generate Zero Shot
☆23 · Apr 16, 2023 · Updated 3 years ago
dmoltisanti / air-cvpr23
This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…
☆13 · May 25, 2023 · Updated 3 years ago
ZhangDailing8 / CPDTrack
☆18 · Feb 8, 2026 · Updated 5 months ago
ltgoslo / factorizer
☆16 · May 14, 2024 · Updated 2 years ago
princetonvisualai / directional-bias-amp
https://arxiv.org/abs/2102.12594
☆14 · Oct 3, 2023 · Updated 2 years ago
IFICL / SLfM
Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆43 · Jul 16, 2026 · Updated last week