PLippmann / multimodal-manga-translation
Context-Informed Machine Translation of Manga using Multimodal Large Language Models
☆11Updated 9 months ago
Alternatives and similar repositories for multimodal-manga-translation
Users that are interested in multimodal-manga-translation are comparing it to the libraries listed below
- ☆12Updated 2 months ago
- [WACV2025] source code of StrDA: https://arxiv.org/abs/2410.09913☆11Updated 5 months ago
- ☆10Updated 10 months ago
- 2D-TPE: Two-Dimensional Positional Encoding Enhances Table Understanding for Large Language Models (WWW 2025)☆10Updated 5 months ago
- Accelerating GOT-OCRv2 with VLLM☆10Updated 10 months ago
- Various video readers for PyTorch models training and a benchmark☆11Updated 3 weeks ago
- UFPR-VCR: a dataset for vehicle color recognition that includes 10,039 images of vehicles in a wide range of real-world conditions, such …☆10Updated 11 months ago
- [⭐️ WACV 2025 Oral ⭐️] PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition☆19Updated 3 months ago
- Significantly improves information retrieval precision using knowledge graphs and contextual retrieval☆10Updated 10 months ago
- KV cache compression via sparse coding☆14Updated 4 months ago
- This is the official repository for "Can GPTs Evaluate Graphic Design Based on Design Principles?".☆12Updated 7 months ago
- Code Release for "MaskTerial: A Foundation Model for Automated 2D Material Flake Detection"☆13Updated 2 months ago
- A bot that provides YouTube video chapters on Twitter (a.k.a. X)☆12Updated 7 months ago
- Demo tutorial on programming an autonomous bot in Python that plays the GeoGuessr game, using different Vision LLMs with LangChain☆11Updated 11 months ago
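- Source code and dataset for a Traditional Chinese Medicine tongue-coating classification and detection system: improved yolo11-bifpn-SDI☆22Updated 10 months ago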
- ☆12Updated last month
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25)☆10Updated 3 months ago
- stochastic bfloat16 based optimizer library☆19Updated 9 months ago
- This project is mainly the assessment project for the 2025 Zhejiang University School of Software summer camp (AI camp)☆11Updated 6 months ago
- 🫧 Code for Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data (Maekawa*, Iso* et al.…☆12Updated 7 months ago
- ☆12Updated 4 months ago
- [CVPR2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents☆40Updated 4 months ago
- Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs (EMNLP 2024)☆14Updated 10 months ago
- ☆14Updated 11 months ago
- SPRINT: Script-agnostic Structure Recognition in Tables☆13Updated 6 months ago
- ☆40Updated 7 months ago
- unofficial☆11Updated 11 months ago
- (ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation☆87Updated 2 months ago
- (Pattern Recognition) PyTorch implementation of “HTR-VT: Handwritten Text Recognition with Vision Transformer”☆96Updated last month
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"☆15Updated 9 months ago