yfliu87 / DataMining_Capstone

Capstone project of UIUC DataMining course

☆8

Alternatives and similar repositories for DataMining_Capstone:

Users who are interested in DataMining_Capstone are comparing it to the repositories listed below.

FatemehShiri / Spatial-MM
☆9 Updated 2 months ago
Pengyue-Lab / uiuc-cs357-fa21-scripts
A repository of useful scripts for the course CS357 in the form of Jupyter Notebook.
☆12 Updated 3 years ago
findalexli / mllm-dpo
[ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model
☆40 Updated 4 months ago
hanmenghan / Skip-n
This repository contains the code of our paper 'Skip \n: A simple method to reduce hallucination in Large Vision-Language Models'.
☆13 Updated last year
DanDoge / Palm
Team Doggeee's solution to the Ego4D LTA challenge @ CVPRW'23
☆12 Updated last year
tomchen-ctj / CVPR23-LOVEU-AQTC
【CVPRW'23】First Place Solution to the CVPR'2023 AQTC Challenge
☆15 Updated last year
Share14 / ShareGemini
☆29 Updated 8 months ago
lscpku / VITATECS
☆18 Updated 8 months ago
archiki / RepARe
☆19 Updated last year
zhaowei-wang-nlp / DivScene
The code of the paper "DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects"
☆14 Updated 5 months ago
lbaermann / qaego4d
Code and Dataset for the CVPRW Paper "Where did I leave my keys? — Episodic-Memory-Based Question Answering on Egocentric Videos"
☆23 Updated last year
patrick-tssn / VideoHallucer
VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)
☆27 Updated this week
szzexpoi / rex
Official Repository for CVPR 2022 paper "REX: Reasoning-aware and Grounded Explanation"
☆21 Updated last year
eric-ai-lab / MMWorld
Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"
☆25 Updated 6 months ago
ruili33 / TPO
☆29 Updated 2 months ago
RenShuhuai-Andy / TESTA
[EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
☆49 Updated last year
thecharm / BDoG
Code for ACM MM 2024 paper "A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning"
☆16 Updated 3 months ago
shuheikurita / RefEgo
☆12 Updated 8 months ago
showlab / MovieSeq
[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences
☆36 Updated 3 weeks ago
Tanveer81 / RGNet
This is the official implementation of RGNet: A Unified Retrieval and Grounding Network for Long Videos
☆14 Updated last month
rohan598 / ConTextual
☆25 Updated 8 months ago
TencentARC / SEED-Bench-R1
☆45 Updated this week
dmoltisanti / air-cvpr23
This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…
☆13 Updated last year
caopulan / CVPR24_Listener
☆12 Updated last year
zyayoung / Awesome-Video-LLMs
Explore VLM-Eval, a framework for evaluating Video Large Language Models, enhancing your video analysis with cutting-edge AI technology.
☆33 Updated last year
VIStA-H / GPT-4V_Social_Media
GPT-4V(ision) as A Social Media Analysis Engine
☆35 Updated 3 months ago
zjucsq / PLA
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆10 Updated last year
CaraJ7 / MME-CoT
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
☆92 Updated this week
mu-cai / TemporalBench
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
☆29 Updated 4 months ago
patrick-tssn / Awesome-Multimodal-Memory
Reading List of Memory Augmented Multimodal Research, including multimodal context modeling, memory in vision and robotics, and external …
☆13 Updated 6 months ago