Gahyeonkim09 / AAPLView external linksLinks
AAPL: Adding Attributes to Prompt Learning for Vision-Language Models (CVPRw 2024)
☆34May 8, 2024Updated last year
Alternatives and similar repositories for AAPL
Users that are interested in AAPL are comparing it to the libraries listed below
Sorting:
- [Pattern Recognition 2024] Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models, Dong Li, Jiandon…☆18Jan 18, 2025Updated last year
- Official PyTorch implementation for "MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens…☆45Jun 12, 2025Updated 8 months ago
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"☆33Nov 29, 2023Updated 2 years ago
- PyTorch Implementation of "ASTRA: An Action Spotting TRAnsformer for Soccer Videos", ACM MMSports 2023. | 3rd place solution for SoccerNe…☆41May 20, 2024Updated last year
- ☆21Nov 9, 2025Updated 3 months ago
- This repository is the project page for "Point Anywhere: Directed Object Estimation from Omnidirectional Images", including source code …☆12Aug 25, 2023Updated 2 years ago
- ☆25Sep 19, 2023Updated 2 years ago
- Code associated with the EMNLP 2024 Main paper: "Image, tell me your story!" Predicting the original meta-context of visual misinformatio…☆45Dec 6, 2025Updated 2 months ago
- Multiple Transformation Function Estimation for Image Enhancement☆22Oct 20, 2024Updated last year
- multimodal change detection☆46Sep 20, 2024Updated last year
- A simple python package to stretch audio files and change their speed☆12Jan 16, 2026Updated 3 weeks ago
- The official implementation of Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion [AAAI'2…☆15Feb 2, 2026Updated last week
- Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion☆47Feb 21, 2025Updated 11 months ago
- Wave Partial Differential Equation Solver in Python☆14Jun 5, 2024Updated last year
- Browser automation for creating new pages in WordPress☆13Jun 7, 2025Updated 8 months ago
- Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?☆15Jun 3, 2025Updated 8 months ago
- Code of the paper "Efficient Object Detection in Autonomous Driving using Spiking Neural Networks: Performance, Energy Consumption Analys…☆27Dec 13, 2023Updated 2 years ago
- [ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models☆80May 29, 2025Updated 8 months ago
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges☆30Sep 24, 2023Updated 2 years ago
- ☆16May 13, 2025Updated 9 months ago
- [ICLR'25] Official repository of paper: Ranking-aware adapter for text-driven image ordering with CLIP☆16Apr 17, 2025Updated 9 months ago
- This is the repository for "SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Recognition"☆16Oct 8, 2024Updated last year
- [ECAI 2023] MonoSKD: General Distillation Framework for Monocular 3D Object Detection via Spearman Correlation Coefficient☆31Dec 8, 2023Updated 2 years ago
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings.☆92Jul 21, 2024Updated last year
- A simple pytorch implementation of baseline based-on CLIP for Image-text Matching.☆18May 25, 2023Updated 2 years ago
- [TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study☆16Nov 22, 2024Updated last year
- ☆37Feb 6, 2023Updated 3 years ago
- VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models☆37Apr 9, 2025Updated 10 months ago
- DALI Multi Agent System Framework☆42Jan 30, 2026Updated 2 weeks ago
- ☆17Aug 1, 2024Updated last year
- Official code release for the paper Trapped in texture bias? A large scale comparison of deep instance segmentation, accepted at ECCV 202…☆16Jan 16, 2024Updated 2 years ago
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality☆21Oct 8, 2024Updated last year
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models"☆37Jan 3, 2024Updated 2 years ago
- [MICCAI2023 Early Accept] Multi-scale Cross-restoration Framework for Electrocardiogram Anomaly Detection☆62Nov 15, 2024Updated last year
- Easy control for Key-Value Constrained Generative LLM Inference(https://arxiv.org/abs/2402.06262)☆63Feb 13, 2024Updated 2 years ago
- 2D road segmentation using lidar data during training☆43Dec 21, 2023Updated 2 years ago
- Official implementation of MetaTree: Learning a Decision Tree Algorithm with Transformers☆114Sep 13, 2024Updated last year
- official implementation of "CLIP-VQDiffusion : Langauge Free Training of Text To Image generation using CLIP and vector quantized diffusi…☆18Sep 5, 2024Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"☆17Mar 31, 2025Updated 10 months ago