alwynpan / uom-comp90024
Demo Code for Subject COMP90024
☆10Updated last year
Related projects
Alternatives and complementary repositories for uom-comp90024
- Project Description☆21Updated 6 months ago
- Teaching Material for COMP90086 - Computer Vision☆15Updated last year
- [ICML 2024] "Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training"☆14Updated 5 months ago
- [ICML 2024] Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models☆11Updated 2 months ago
- In this fast-paced world, we all need a little something to spice up life. Whether you need a glass of sweet talk to lift your spirits or…☆8Updated this week
- Official implementation of ECCV 2024 paper: Take A Step Back: Rethinking the Two Stages in Visual Reasoning☆9Updated last month
- [ECCV2022] A PyTorch implementation of the paper "Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embo…☆13Updated last year
- This repository is used for advertising PhD recruitment opportunities. Contributions are welcome!☆159Updated 2 months ago
- [CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models☆93Updated 2 months ago
- [AAAI 2024] Official implementation of NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models☆152Updated last year
- Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition"☆164Updated 2 months ago
- We present a new method for long-tailed out-of-distribution detection☆12Updated 10 months ago
- The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models"☆164Updated last month
- [NeurIPS 2023] Generalized Logit Adjustment☆34Updated 7 months ago
- Awesome papers on multimodal LLMs with grounding ability☆11Updated 3 months ago
- [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models"☆43Updated 2 months ago
- ☆16Updated 3 weeks ago
- A curated list of awesome papers on Embodied AI and related research/industry-driven resources.☆289Updated 3 months ago
- The paper collections for the autoregressive models in vision.☆229Updated this week
- ☆11Updated last month
- Code for the paper "Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding"☆15Updated 2 months ago
- [NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation☆28Updated 8 months ago
- [EMNLP 2024 Oral] MatchTime: Towards Automatic Soccer Game Commentary Generation☆41Updated last month
- A Chinese-language guide to embodied AI☆409Updated this week
- AL-Ref-SAM 2: Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segm…☆69Updated last month
- A Large Multimodal Model for Pixel-Level Visual Grounding in Videos☆34Updated 2 weeks ago
- Preview code of ECCV'24 paper "Distill Gold from Massive Ores" (BiLP)☆23Updated 4 months ago
- ☆75Updated 3 weeks ago
- [NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"☆96Updated last week