liyingxuan1012 / zeroshot-speaker-prediction
Official repository of "Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion" (ACMMM 2024)
☆15 · Oct 31, 2024 · Updated last year
Alternatives and similar repositories for zeroshot-speaker-prediction
Users who are interested in zeroshot-speaker-prediction are comparing it to the repositories listed below.
- Official repository of Manga109Dialog (ICME 2024) ☆26 · Aug 3, 2024 · Updated last year
- Various annotations of Manga109 dataset ☆13 · Apr 23, 2025 · Updated 9 months ago
- ☆11 · Jun 3, 2025 · Updated 8 months ago
- ☆14 · Apr 7, 2025 · Updated 10 months ago
- The code will come soon. ☆15 · Sep 12, 2025 · Updated 5 months ago
- Anime Dataset Generator: Fetch, analyze, and utilize comprehensive anime data using the Jikan API. ☆12 · Jul 28, 2023 · Updated 2 years ago
- ☆11 · May 15, 2025 · Updated 9 months ago
- ☆11 · Aug 29, 2025 · Updated 5 months ago
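- Collection of BUPT (Beijing University of Posts and Telecommunications) course designs and major assignments ☆11 · Mar 25, 2024 · Updated last year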
- This repository implements computer vision for real-time chessboard detection and piece recognition. Using OpenCV and Numpy, the system p… ☆12 · Sep 24, 2024 · Updated last year
- ☆11 · Sep 18, 2017 · Updated 8 years ago
- Support page for a technical book ☆10 · Aug 21, 2020 · Updated 5 years ago
- Multilingual entity linking with the BELA model ☆12 · Jul 20, 2023 · Updated 2 years ago
- ECCV2020_Spatial Hierarchy Aware Residual Pyramid Network for Time-of-Flight Depth Denoising ☆12 · Sep 24, 2020 · Updated 5 years ago
- LVAS-Agent Code Base ☆22 · Apr 15, 2025 · Updated 10 months ago
- Course guide for the School of Information, Dalian Maritime University ☆19 · Feb 27, 2022 · Updated 3 years ago
- Official implementation for the paper "Self-Play Reinforcement Learning for Fast Image Retargeting" ☆10 · Oct 5, 2020 · Updated 5 years ago
- ☆10 · Oct 24, 2016 · Updated 9 years ago
- Accepted to ICLR 2025. MetaMetrics is a calibrated meta-metric designed to evaluate generation tasks across different modalities aligned … ☆14 · Dec 30, 2024 · Updated last year
- [TKDD 2024] An official source code for paper Mixed Graph Contrastive Network for Semi-Supervised Node Classification. ☆14 · Mar 6, 2025 · Updated 11 months ago
- [CVPR 2025] DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding ☆26 · Dec 18, 2025 · Updated 2 months ago
- Twitter sentiment analysis of trending movies and songs. ☆10 · May 2, 2021 · Updated 4 years ago
- Code and data for the paper: DTSM: Toward Dense Table Structure Recognition with Text Query Encoder and Adjacent Feature Aggregator ☆12 · Apr 28, 2024 · Updated last year
- chainer v2 implementation of instance normalization ☆11 · Aug 8, 2018 · Updated 7 years ago
- Hands-On Tutorial on Building Multimodal RAG Systems ☆13 · Apr 10, 2025 · Updated 10 months ago
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering ☆16 · Oct 31, 2024 · Updated last year
- The official implementation of CTRNet++. ☆14 · Dec 30, 2024 · Updated last year
- thesis slides repository for cvpaper.challenge ☆11 · Apr 25, 2019 · Updated 6 years ago
- ☆15 · Sep 30, 2023 · Updated 2 years ago
- My school projects. ☆11 · Jan 10, 2024 · Updated 2 years ago
- Find XS-Leaks in the browser by diffing DOM-Graphs in two states ☆18 · Jan 20, 2025 · Updated last year
- Theano ☆11 · Aug 26, 2017 · Updated 8 years ago
- ☆15 · Mar 24, 2025 · Updated 10 months ago
- Demo scripts for Manga109 ☆12 · Nov 20, 2021 · Updated 4 years ago
- Source code of the TextLap model, an LLM for text-2-layout generation. ☆17 · Oct 21, 2024 · Updated last year
- This repository accompanies the paper "Investigating Gender Fairness of Recommendation Algorithms in the Music Domain" ☆15 · Jul 13, 2021 · Updated 4 years ago
- ☆15 · Feb 5, 2024 · Updated 2 years ago
- ☆12 · Feb 17, 2017 · Updated 9 years ago
- [ACM MM 2024] Pytorch Code for the paper "Robust Variational Contrastive Learning for Partially View-unaligned Clustering" ☆14 · Feb 7, 2026 · Updated last week