dieuroi / SimAesthetics

☆10

Alternatives and similar repositories for SimAesthetics

Users who are interested in SimAesthetics are comparing it to the libraries listed below.

dieuroi / Awesome-Image-Aesthetic-Assessment
This is a list of resources that utilize machine learning technologies to solve image aesthetic assessment.
☆45 Updated 11 months ago
CuthbertCai / Ask-Confirm
Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query (ICCV2021)
☆20 Updated 3 years ago
TencentYoutuResearch / HighlightDetection-CLC
Code for CVPR2023 paper "Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies"
☆17 Updated 2 years ago
wenz116 / DRFT
End-to-end Multi-modal Video Temporal Grounding, NeurIPS 2021
☆18 Updated 3 years ago
frostinassiky / bsp
Placeholder for code of BSP.
☆11 Updated 3 years ago
kaipengfang / ProS
☆20 Updated 10 months ago
FeiElysia / awesome-zero-shot-captioning
A curated list of zero-shot captioning papers
☆22 Updated last year
ZhenZHAO / awesome-video-moment-retrieval
paper list on Video Moment Retrieval (VMR), or Natural Language Video Localization (NLVL), or Temporal Sentence Grounding in Videos (TSGV…
☆31 Updated 2 years ago
Lookuz / VidHal
Codebase for VidHal: Benchmarking Hallucinations in Vision LLMs
☆12 Updated last month
huanranchen / Visualize-Loss-Landscape
Respect to the input tensor instead of parameters of NN
☆19 Updated 2 years ago
TencentARC / TVTS
Turning to Video for Transcript Sorting
☆48 Updated last year
mengcaopku / DCNet
[ACM MM 22] Correspondence Matters for Video Referring Expression Comprehension
☆15 Updated 2 years ago
renjie-liang / HUAL
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning
☆14 Updated last year
showlab / MovieSeq
[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences
☆39 Updated 2 months ago
GasolSun36 / MVP
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
☆22 Updated 8 months ago
dhg-wei / TOPA
(NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment
☆31 Updated 8 months ago
wengzejia1 / Open-VCLIP
☆115 Updated last year
GenjiB / ECLIPSE
☆32 Updated 2 years ago
showlab / Region_Learner
The PyTorch implementation for "Video-Text Pre-training with Learned Regions"
☆42 Updated 2 years ago
mengcaopku / LocVTP
[ECCV 22] LocVTP: Video-Text Pre-training for Temporal Localization
☆39 Updated 2 years ago
NeverMoreLCH / Awesome-Video-Grounding
A reading list of papers about Visual Grounding.
☆31 Updated 2 years ago
sheng-eatamath / S3A
repo for paper titled: Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment (AAAI'24 Oral)
☆25 Updated last year
NNNNAI / Ego4d_NLQ_2022_1st_Place_Solution
The 1st place solution of 2022 Ego4d Natural Language Queries.
☆32 Updated 2 years ago
dhg-wei / MCL
(ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning
☆27 Updated 8 months ago
solicucu / D3G
☆14 Updated last year
Jack-lx-jiang / VBAD
Black-box Adversarial Attacks on Video Recognition Models. (VBAD)
☆26 Updated 5 years ago
tzhhhh123 / HC-STVG
The HC-STVG Dataset
☆55 Updated 2 years ago
csbobby / STAR_Benchmark
☆32 Updated last year
Ruiyang-061X / Awesome-MLLM-Uncertainty
✨A curated list of papers on the uncertainty in multi-modal large language model (MLLM).
☆45 Updated 2 months ago
Lihr747 / CgtGAN
☆17 Updated last month