emanuelevivoli/awesome-comics-understanding

The official repo of the Comics Survey: "A missing piece in Vision and Language: A Survey on Comics Understanding"

☆139

Alternatives and similar repositories for awesome-comics-understanding

Users who are interested in awesome-comics-understanding are comparing it to the repositories listed below.

ganjiro / OfflineMania
[COG24] - Official repository of "OfflineMania: A Benchmark Environment for Offline Reinforcement Learning in Racing Games"
☆12 · Jul 15, 2024 · Updated 2 years ago
emanuelevivoli / CoMix-dataset
Repository for "CoMix: Comprehensive Benchmark for Multi-Task Comic Understanding"
☆18 · Nov 20, 2024 · Updated last year
miccunifi / KDPL
[ECCV 2024] - Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
☆62 · Feb 20, 2026 · Updated 5 months ago
furkanbiten / object-bias
Let there be clock in the beach - WACV 2022
☆15 · Nov 15, 2021 · Updated 4 years ago
ayanban011 / SVGCraft
[WACV 2026 Round 1] Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout
☆24 · Oct 11, 2025 · Updated 9 months ago
dali92002 / HTRbyMatching
Handwritten Text Recognition in Few-shot Scenario
☆22 · Mar 25, 2023 · Updated 3 years ago
NiccoBiondi / ContrastiveSupervisedDistillation
This repo contains the code of "Contrastive Supervised Distillation for Continual Representation Learning", Tommaso Barletti, Niccolò Bio…
☆20 · Jul 5, 2022 · Updated 4 years ago
AndresPMD / semantic_adaptive_margin
WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
☆16 · Dec 10, 2021 · Updated 4 years ago
miccunifi / ISSUES
[ICCVW 2023] - Mapping Memes to Words for Multimodal Hateful Meme Classification
☆27 · Apr 17, 2025 · Updated last year
simomagi / elastic_feature_consolidation
[ICLR 2024] - Elastic Feature Consolidation for Cold Start Exemplar-Free Incremental Learning
☆34 · May 26, 2025 · Updated last year
miccunifi / CIRCO
[ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset
☆87 · Aug 6, 2025 · Updated 11 months ago
dali92002 / OCR-TR
Optical Character Recognition (OCR / HTR) using Transformers
☆11 · Aug 20, 2022 · Updated 3 years ago
LorenzoAgnolucci / Keyframes-GAN
[IEEE TMM 2023] This is the official repo of the paper "Perceptual Quality Improvement in Videoconferencing using Keyframes-based GAN".
☆17 · Dec 10, 2024 · Updated last year
aimagelab / HWD
☆27 · Mar 7, 2025 · Updated last year
VLR-CVC / DocVQA2026
Official evaluation scripts and baseline prompts for the DocVQA 2026 (ICDAR 2026) Competition on Multimodal Reasoning over Documents.
☆16 · Mar 16, 2026 · Updated 4 months ago
biswassanket / synth_doc_generation
Official PyTorch Implementation of DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis - ICDAR 2021
☆93 · Jul 16, 2021 · Updated 5 years ago
Marchetz / MANTRA-CVPR20
Official PyTorch code for MANTRA - Memory Augmented Neural Trajectory Predictor (CVPR2020)
☆78 · Aug 24, 2022 · Updated 3 years ago
emanuelevivoli / ComiCap
[ECCV-W] Official repo for the paper "ComiCap: A VLMs pipeline for dense captioning of Comic Panels"
☆15 · Nov 20, 2024 · Updated last year
dali92002 / SSL-OCR
Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023
☆30 · Jul 12, 2023 · Updated 3 years ago
biswassanket / DocSegTr
A Bottom-Up Instance Segmentation Strategy for segmenting document instances using Transformers
☆59 · Sep 9, 2024 · Updated last year
oronnir / CAST
☆18 · Sep 14, 2024 · Updated last year
SonyResearch / IISA
[ICCV 2025] - Image Intrinsic Scale Assessment: Bridging the Gap Between Quality and Resolution
☆17 · Aug 16, 2025 · Updated 11 months ago
furkanbiten / SelectiveTextStyleTransfer
ICDAR 2019
☆25 · Aug 2, 2019 · Updated 6 years ago
ragavsachdeva / magi
Generate a transcript for your favourite Manga: Detect manga characters, text blocks and panels. Order panels. Cluster characters. Match …
☆460 · Jun 27, 2025 · Updated last year
AlbinSou / unlearning-challenge-metric
Attempt at reproducing the metric from Neurips 2023 Unlearning Challenge on Kaggle. Code for training checkpoints on retain set and unlea…
☆12 · Nov 8, 2023 · Updated 2 years ago
AndresPMD / Fine_Grained_Clf
Based on the WACV 2020 paper - Fine Grained Classification and Retrieval by Combining Visual and Locally Pooled Textual Features
☆25 · Nov 15, 2021 · Updated 4 years ago
miccunifi / QualiCLIP
Quality-Aware Image-Text Alignment for Opinion-Unaware Image Quality Assessment
☆132 · Mar 10, 2025 · Updated last year
AndresPMD / Pytorch-yolo-phoc
PyTorch implementation of the code from the ECCV 2018 paper - Single Shot Scene Text Retrieval
☆13 · Dec 15, 2021 · Updated 4 years ago
amazon-science / textadain-robust-recognition
TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers
☆21 · Jul 26, 2022 · Updated 4 years ago
miccunifi / SpectralGCD
[ICLR 2026] - Spectral Concept Selection and Cross-modal Representation Learning for Generalized Category Discovery
☆23 · Mar 18, 2026 · Updated 4 months ago
AlbinSou / ocl_survey
Code for "A Comprehensive Empirical Evaluation on Online Continual Learning" ICCVW 2023 VCL Workshop
☆45 · Apr 8, 2024 · Updated 2 years ago
dali92002 / DocEnTR
DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022
☆190 · Jan 17, 2025 · Updated last year
dfki-av / AWT-for-CISS
Official repository for our paper on "Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segme…
☆12 · Jan 3, 2023 · Updated 3 years ago
OxRML / MADQA
Multimodal Agentic Document QA benchmark (MADQA)
☆39 · Mar 13, 2026 · Updated 4 months ago
miccunifi / ARNIQA
[WACV 2024 Oral] - ARNIQA: Learning Distortion Manifold for Image Quality Assessment
☆156 · Jun 18, 2026 · Updated last month
georgeretsi / Seq2Emb
Create handwritten word embeddings from a text recognition Seq2Seq system.
☆11 · Dec 1, 2022 · Updated 3 years ago
miccunifi / Cross-the-Gap
[ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
☆70 · Nov 30, 2025 · Updated 7 months ago
koninik / HTG_evaluation
Official PyTorch Implementation of "Rethinking HTG Evaluation: Bridging Generation and Recognition" (Oral) - 1st Workshop on Critical Eva…
☆17 · Sep 23, 2024 · Updated last year
LorenzoGianassi / Land-Diffuser
The Land-Diffuser is a novel application of the Denoising Diffusion Probabilistic Model (DDPM) in the realm of 3D Talking Head generation…
☆13 · Dec 23, 2023 · Updated 2 years ago