Official repository of "Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion" (ACMMM 2024)
☆16 · Oct 31, 2024 · Updated last year
Alternatives and similar repositories for zeroshot-speaker-prediction
Users that are interested in zeroshot-speaker-prediction are comparing it to the libraries listed below.
- Various annotations of Manga109 dataset ☆13 · Apr 23, 2025 · Updated last year
- ☆11 · Jun 3, 2025 · Updated last year
- The code will come soon. ☆16 · Sep 12, 2025 · Updated 9 months ago
- LVAS-Agent Code Base ☆21 · Apr 15, 2025 · Updated last year
- ECCV2020_Spatial Hierarchy Aware Residual Pyramid Network for Time-of-Flight Depth Denoising ☆12 · Sep 24, 2020 · Updated 5 years ago
- Code and data for the paper: DTSM: Toward Dense Table Structure Recognition with Text Query Encoder and Adjacent Feature Aggregator ☆13 · Apr 28, 2024 · Updated 2 years ago
- This repository implements computer vision for real-time chessboard detection and piece recognition. Using OpenCV and Numpy, the system p… ☆15 · Sep 24, 2024 · Updated last year
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆21 · Apr 9, 2025 · Updated last year
- ☆13 · Aug 29, 2025 · Updated 10 months ago
- [TKDD 2024] Official source code for the paper "Mixed Graph Contrastive Network for Semi-Supervised Node Classification". ☆14 · Mar 6, 2025 · Updated last year
- The official implementation of CTRNet++. ☆15 · Dec 30, 2024 · Updated last year
- chainer v2 implementation of instance normalization ☆11 · Aug 8, 2018 · Updated 7 years ago
- ☆16 · Feb 5, 2024 · Updated 2 years ago
- [ACM MM 2025] DA-Font: Few-Shot Font Generation via Dual-Attention Hybrid Integration ☆52 · Mar 18, 2026 · Updated 3 months ago
- ☆11 · Sep 18, 2017 · Updated 8 years ago
- Introduces a novel Video Trimming (VT) task and proposes an agent-based approach (AVT) for detecting wasted footage, selecting valuable se… ☆26 · Jan 20, 2025 · Updated last year
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆28 · Apr 14, 2025 · Updated last year
- mindmap: Spatial Memory in Deep Feature Maps for 3D Action Policies ☆52 · Oct 16, 2025 · Updated 8 months ago
- ☆19 · Sep 11, 2024 · Updated last year
- [CVPR 2025] DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding ☆30 · Dec 18, 2025 · Updated 6 months ago
- Official implementation for the paper "Self-Play Reinforcement Learning for Fast Image Retargeting" ☆10 · Oct 5, 2020 · Updated 5 years ago
- ☆12 · Feb 17, 2017 · Updated 9 years ago
- Support page for a technical book. ☆10 · Aug 21, 2020 · Updated 5 years ago
- My school projects. ☆11 · Jan 10, 2024 · Updated 2 years ago
- Source code of the TextLap model, an LLM for text-2-layout generation. ☆18 · Oct 21, 2024 · Updated last year
- ☆28 · Aug 22, 2025 · Updated 10 months ago
- [ICLR 2025] EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing ☆27 · Apr 1, 2025 · Updated last year
- Code/data of the paper "Hand-Object Contact Prediction via Motion-Based Pseudo-Labeling and Guided Progressive Label Correction" (BMVC202… ☆17 · Oct 22, 2021 · Updated 4 years ago
- Segmentation of text in manga images ☆141 · Feb 6, 2021 · Updated 5 years ago
- Dual Fusion-Propagation Graph Neural Network for Multi-View Clustering ☆16 · Sep 15, 2021 · Updated 4 years ago
- Theano ☆11 · Aug 26, 2017 · Updated 8 years ago
- Repository for the KVP10k dataset ☆23 · Sep 18, 2025 · Updated 9 months ago
- Code for the ECCV 2020 paper: `Look here! A learning based approach to redirect visual attention' ☆13 · Aug 19, 2020 · Updated 5 years ago
- Official implementation of Generative Colorization of Structured Mobile Web Pages, WACV 2023. ☆22 · Dec 7, 2023 · Updated 2 years ago
- ☆17 · Mar 24, 2025 · Updated last year
- ☆24 · Oct 13, 2024 · Updated last year
- ☆12 · Jun 18, 2021 · Updated 5 years ago
- [ICML 2026] 🏂 World Guidance: World Modeling in Condition Space for Action Generation ☆142 · Apr 28, 2026 · Updated 2 months ago
- ☆25 · Jul 31, 2024 · Updated last year