manga109 / public-annotations
Various annotations of the Manga109 dataset
☆13Apr 23, 2025Updated 9 months ago
Alternatives and similar repositories for public-annotations
Users who are interested in public-annotations are comparing it to the libraries listed below
- Official repository of Manga109Dialog (ICME 2024)☆26Aug 3, 2024Updated last year
- Official repository of "Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion" (ACMMM 2024)☆15Oct 31, 2024Updated last year
- Simple python API to read annotation data of Manga109☆128Mar 4, 2022Updated 3 years ago
- A Simple Framework for CV Pre-training Model (SOCO, VirTex, BEiT)☆15Oct 18, 2021Updated 4 years ago
- Code for GLAT (Global Local Transformer), ECCV 2020 "Learning Visual Commonsense for Robust Scene Graph Generation"☆11Dec 16, 2020Updated 5 years ago
- CenterMask2 on detectron2 (open images)☆10May 28, 2020Updated 5 years ago
- Support page for a technical book☆10Aug 21, 2020Updated 5 years ago
- ☆12Apr 24, 2024Updated last year
- Yonsei Natural Language Understanding tool☆12Dec 7, 2022Updated 3 years ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".☆18Sep 17, 2021Updated 4 years ago
- An AutoDL environment for native fine-tuning of Stable Diffusion.☆11Dec 7, 2022Updated 3 years ago
- ☆10Oct 24, 2016Updated 9 years ago
- OneFlow Diffusers Web UI☆11Apr 11, 2023Updated 2 years ago
- ☆14Mar 18, 2022Updated 3 years ago
- ☆10May 30, 2020Updated 5 years ago
- thesis slides repository for cvpaper.challenge☆11Apr 25, 2019Updated 6 years ago
- [ICML24] Official Implementation of "ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections"☆16May 31, 2024Updated last year
- The official implementation of "Grounded Chain-of-Thought for Multimodal Large Language Models"☆21Jul 21, 2025Updated 6 months ago
- Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION"☆15Nov 11, 2024Updated last year
- Source code of the TextLap model, an LLM for text-to-layout generation.☆17Oct 21, 2024Updated last year
- Theano☆11Aug 26, 2017Updated 8 years ago
- Online Detection of Action Start in Untrimmed, Streaming Videos☆12Sep 1, 2018Updated 7 years ago
- Neural Image Assessment, a tool to automatically inspect quality of images.☆12Mar 1, 2022Updated 3 years ago
- Unofficial implementation of Face0 with SDXL☆12Sep 1, 2023Updated 2 years ago
- ☆15Jun 15, 2020Updated 5 years ago
- [ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model☆17Feb 13, 2025Updated last year
- ☆14Feb 9, 2023Updated 3 years ago
- ☆13Apr 7, 2022Updated 3 years ago
- Pytorch implementation of Single-Stage Multi-Person Pose Machines (ICCV'19)☆15Jan 15, 2020Updated 6 years ago
- Robustness properties of Facebook's ResNeXt WSL models☆15Dec 7, 2019Updated 6 years ago
- RLHF for Stable Diffusion☆14Jul 9, 2023Updated 2 years ago
- ☆12Dec 17, 2019Updated 6 years ago
- Code for the ECCV 2020 paper: `Look here! A learning based approach to redirect visual attention'☆13Aug 19, 2020Updated 5 years ago
- An extension of VPN for Recognition of Activities of Daily Living☆16May 17, 2021Updated 4 years ago
- ☆14May 17, 2022Updated 3 years ago
- [CVPR'25] A vision question answering (VQA) benchmark for 6D spatial reasoning.☆20Jun 17, 2025Updated 7 months ago
- ☆15Jun 19, 2018Updated 7 years ago
- This is the implementation of our AURL paper "Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification".☆15May 13, 2022Updated 3 years ago
- A Python wrapper for the Feishu Docs API that reads and writes documents directly☆21Nov 24, 2023Updated 2 years ago