PLippmann / multimodal-manga-translation
Context-Informed Machine Translation of Manga using Multimodal Large Language Models
☆11 Updated 7 months ago
Alternatives and similar repositories for multimodal-manga-translation
Users who are interested in multimodal-manga-translation are comparing it to the libraries listed below.
- ☆10 Updated 2 weeks ago
- ☆10 Updated 8 months ago
- [WACV2025] source code of StrDA: https://arxiv.org/abs/2410.09913 ☆11 Updated 3 months ago
- 2D-TPE: Two-Dimensional Positional Encoding Enhances Table Understanding for Large Language Models (WWW 2025) ☆10 Updated 3 months ago
- [⭐️ WACV 2025 Oral ⭐️] PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition ☆14 Updated last month
- KV cache compression via sparse coding ☆11 Updated 2 months ago
- A bot that provides YouTube video chapters on Twitter (a.k.a. X) ☆12 Updated 5 months ago
- An open-source server implementation for running inference with Qwen2-VL series models using FastAPI. ☆9 Updated 8 months ago
- SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context ☆5 Updated 7 months ago
- [ICLR 2025] Causal Graphical Models for Vision-Language Compositional Understanding ☆9 Updated 3 months ago
- stochastic bfloat16 based optimizer library ☆16 Updated 7 months ago
- PyTorch implementation of the paper: "What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Vision-Language Models." … ☆11 Updated 4 months ago
- rasters and gexf network files for the not david power of friendship video ☆13 Updated 6 months ago
- ☆10 Updated 3 weeks ago
- This is the official repository for "Can GPTs Evaluate Graphic Design Based on Design Principles?". ☆12 Updated 5 months ago
- Significantly improves information retrieval precision using knowledge graphs and contextual retrieval ☆10 Updated 8 months ago
- Accelerating GOT-OCRv2 with VLLM ☆10 Updated 8 months ago
- Code Release for "MaskTerial: A Foundation Model for Automated 2D Material Flake Detection" ☆11 Updated last week
- 🫧 Code for Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data (Maekawa*, Iso* et al.… ☆12 Updated 5 months ago
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors" ☆14 Updated 7 months ago
- The Official Code Repo for EgoOrientBench [CVPR25] ☆13 Updated 3 months ago
- Demo tutorial on how to program in Python an autonomous bot that plays the GeoGuessr game, using different Vision LLMs with LangChain ☆11 Updated 9 months ago
- Image processing tool for ComfyUI ☆8 Updated 2 months ago
- Selenium-based script for booking SJTU sports venues ☆10 Updated 9 months ago
- Instagram Automation Tool is a framework that automates various Instagram tasks, including file-based operations and web automation (via … ☆16 Updated 2 months ago
- Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs (EMNLP 2024) ☆11 Updated 8 months ago
- ☆13 Updated 9 months ago
- Official repo of dataset-decomposition paper [NeurIPS 2024] ☆19 Updated 6 months ago
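- This project is mainly the assessment project for the 2025 cohort of the Zhejiang University School of Software summer camp (AI camp) ☆11 Updated 4 months ago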
- [NeurIPS 2024] IF-Font: Ideographic Description Sequence-Following Font Generation ☆17 Updated 4 months ago