lookwei / COMP4423
Course materials for COMP 4423 - Computer Vision for Beginners at the Hong Kong Polytechnic University
☆28 Updated last year
Alternatives and similar repositories for COMP4423:
Users who are interested in COMP4423 are comparing it to the repositories listed below
- ☆104 Updated last month
- EmoLLM: Multimodal Emotional Understanding Meets Large Language Models ☆14 Updated 8 months ago
- ☆50 Updated 4 months ago
- Let the IELTS-prompted GPTs accompany you through IELTS mock exams, help you with scoring and provide suggestions for improvement. ☆17 Updated 10 months ago
- Code for AAAI24 paper Text-Guided Molecule Generation with Diffusion Language Model ☆23 Updated 6 months ago
- The official repository for the Scientific Paper Idea Proposer (SciPIP) ☆60 Updated this week
- [ACL 2024 Best Paper] Deciphering Oracle Bone Language with Diffusion Models ☆95 Updated 3 weeks ago
- Repository for Text2Mol: Cross-Modal Molecular Retrieval with Natural Language Queries ☆42 Updated last year
- Narrative movie understanding benchmark ☆66 Updated 9 months ago
- ☆18 Updated 2 years ago
- My slides and examples for bachelor deep learning course ☆11 Updated 2 years ago
- [ACL 2024] Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models. Detect and mitigate object hallucinatio… ☆20 Updated last month
- Official Repo for FoodieQA paper (EMNLP 2024) ☆15 Updated 3 months ago
- Compositional Inversion for Stable Diffusion Models (AAAI 2024) ☆35 Updated this week
- ☆42 Updated 3 months ago
- ☆28 Updated last year
- [Paper] [AAAI 2024] Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations ☆131 Updated 8 months ago
- Multimodal Empathetic Chatbot ☆31 Updated 7 months ago
- (ICLR'25) A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents ☆45 Updated last month
- Using image captions with LLM for zero-shot VQA ☆15 Updated 11 months ago
- [IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment ☆51 Updated 10 months ago
- [ICLR 2024] Contextualized Diffusion Models for Text-Guided Image and Video Generation ☆64 Updated 9 months ago
- 😎 A summary of papers on knowledge-based text generation, with personal notes ☆21 Updated 4 months ago
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b… ☆71 Updated last year
- David's PolyU Works ☆26 Updated 3 months ago
- Video Chain of Thought: code for the ICML 2024 paper "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition" ☆100 Updated last week
- Code for ACM MM 2024 paper "A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning" ☆13 Updated 2 months ago
- [ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model ☆126 Updated 10 months ago