lookwei / COMP4423
Course materials for COMP 4423 - Computer Vision for Beginners at the Hong Kong Polytechnic University
☆28 · Updated last year
Alternatives and similar repositories for COMP4423:
Users who are interested in COMP4423 are comparing it to the repositories listed below
- EmoLLM: Multimodal Emotional Understanding Meets Large Language Models ☆14 · Updated 9 months ago
- Compositional Inversion for Stable Diffusion Models (AAAI 2024) ☆35 · Updated last month
- ☆32 · Updated last year
- Code for ACM MM 2024 paper "A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning" ☆16 · Updated 3 months ago
- My slides and examples for bachelor deep learning course ☆11 · Updated 2 years ago
- Synth-Empathy: Towards High-Quality Synthetic Empathy Data ☆15 · Updated last month
- This is the official implementation of 2024 CVPR paper "EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models". ☆78 · Updated 2 months ago
- [NeurIPS 2023] Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs ☆126 · Updated last year
- ☆42 · Updated last year
- (ICLR'25) A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents ☆63 · Updated 2 months ago
- Personal PolyU COMP UG Subject Archive ☆14 · Updated 2 months ago
- Explainable Multimodal Emotion Reasoning (EMER) and AffectGPT ☆144 · Updated 11 months ago
- [ACM MM 2022] This is the official implementation of "Temporal Sentiment Localization: Listen and Look in Untrimmed Videos" ☆16 · Updated last month
- ☆19 · Updated 2 years ago
- [AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval. ☆39 · Updated 5 months ago
- TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models ☆14 · Updated 3 months ago
- [AAAI 2023] AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work ☆20 · Updated last year
- [ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model ☆129 · Updated 11 months ago
- This repository compiles a list of papers related to Video LLM. ☆20 · Updated 9 months ago
- Official Repo for National Industrial Software Congress 2023: "An Implementation of Multimodal Fusion System for Intelligent Digital Human… ☆21 · Updated last year
- A programmatic instruction template generator aiming at enhancing the understanding of the critical role instruction templates play in la… ☆16 · Updated 3 months ago
- Awesome works based on SSM and Mamba ☆17 · Updated 11 months ago
- [IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment ☆51 · Updated 11 months ago
- ICLR2024 statistics ☆47 · Updated last year
- [CVPR 2024] EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning ☆29 · Updated 6 months ago
- Code for "LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model", CVPR 2024 Highlight ☆42 · Updated 9 months ago
- ☆42 · Updated 4 months ago
- ☆16 · Updated 5 months ago
- Using image captions with LLM for zero-shot VQA ☆16 · Updated last year
- Aims for memory-efficient training (24GB VRAM) on consumer GPUs. Optimizing language models through guidance tokens in reasoning chains, … ☆25 · Updated last month