XMUDeepLIT / Translatotron-V
Code for "Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation" (Findings of ACL 2024)
☆16 · Jul 4, 2024 · Updated last year
Alternatives and similar repositories for Translatotron-V
Users who are interested in Translatotron-V are comparing it to the libraries listed below.
- ☆12 · Apr 18, 2025 · Updated 9 months ago
- ☆23 · Jun 26, 2025 · Updated 7 months ago
- Java implementation of the BERT tokenizer; supports outputting ONNX tensors for ONNX model inference ☆12 · Sep 4, 2023 · Updated 2 years ago
- The implementation of Text Classification with Negative Supervision (ACL, 2020) ☆10 · Oct 8, 2020 · Updated 5 years ago
- [ECAI 2024] First Creating Backgrounds Then Rendering Texts: A New Paradigm for Visual Text Blending ☆14 · Jun 16, 2025 · Updated 7 months ago
- Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling" ☆16 · Nov 11, 2024 · Updated last year
- The official repository of "Document Image Machine Translation with Dynamic Multi-pre-trained Models Assembling" ☆12 · Nov 26, 2025 · Updated 2 months ago
- [CVPR' 25] Official implementation of the paper "Pseudo Visible Feature Fine-Grained Fusion for Thermal Object Detection" ☆22 · Aug 29, 2025 · Updated 5 months ago
- Visa officers dispel rumors about U.S. student visa applications ☆10 · May 30, 2018 · Updated 7 years ago
- MeloTTS demo on Axera ☆10 · Nov 18, 2025 · Updated 2 months ago
- Word segmentation of Chinese corpora using SentencePiece ☆13 · Nov 30, 2023 · Updated 2 years ago
- ☆34 · Jan 10, 2026 · Updated last month
- ☆13 · Sep 25, 2023 · Updated 2 years ago
- 👂 Typing is slow, talk to me. The project name means "I am tired" in Chinese (我累了). This is an AI efficiency assistant, complete your d… ☆16 · Jun 8, 2024 · Updated last year
- Vision-Language Pre-Training for Boosting Scene Text Detectors (CVPR2022) ☆12 · Mar 21, 2022 · Updated 3 years ago
- ☆16 · Jan 31, 2025 · Updated last year
- IA004 blockchain BP group ☆13 · Apr 19, 2019 · Updated 6 years ago
- ☆17 · Jul 5, 2022 · Updated 3 years ago
- Implementation of LTC-SUM: Lightweight Client-driven Personalized Video Summarization Framework Using 2D CNN ☆22 · Jul 11, 2023 · Updated 2 years ago
- Implementation of Baseline for Scene Text-to-Scene Text Translation ☆18 · Mar 30, 2025 · Updated 10 months ago
- Code for Efficient Test-Time Scaling via Self-Calibration ☆19 · Sep 13, 2025 · Updated 5 months ago
- Code for IEEE Trans. on Multimedia (TMM) paper "Object-aware Multimodal Named Entity Recognition in Social Media Posts with Adversarial L… ☆20 · Mar 3, 2021 · Updated 4 years ago
- ☆19 · Aug 9, 2024 · Updated last year
- ☆18 · Jul 25, 2024 · Updated last year
- ☆21 · Aug 27, 2025 · Updated 5 months ago
- I designed an intelligent system capable of analyzing movement within the videos and detecting suspicious movement that precedes the occu… ☆22 · May 5, 2024 · Updated last year
- AnyTrans: Translate AnyText in the Image with Large Scale Models (EMNLP2024 Findings) ☆25 · Dec 11, 2024 · Updated last year
- ☆23 · Aug 20, 2024 · Updated last year
- Code for the paper, Temporal Action Localization with Enhanced Instant Discriminability ☆28 · Mar 25, 2024 · Updated last year
- The official implementation of RS-STE proposed by our paper Recognition-Synergistic Scene Text Editing (CVPR 2025). ☆29 · Jul 15, 2025 · Updated 6 months ago
- ☆50 · Jan 29, 2026 · Updated 2 weeks ago
- This is a web-based intelligent dialogue program built using ASR, LLM, and TTS. ☆24 · Dec 3, 2024 · Updated last year
- Official implementation of Data-Free Sketch-Based Image Retrieval, CVPR 2023. ☆27 · Sep 8, 2023 · Updated 2 years ago
- Chinese-Mimi adapts the vocoder of the Moshi model to Chinese-language corpora. ☆34 · Mar 13, 2025 · Updated 11 months ago
- Solution of The AI City Challenge 2022 Track 1 ☆26 · Apr 26, 2022 · Updated 3 years ago
- ☆30 · Mar 13, 2024 · Updated last year
- [ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling ☆31 · Feb 6, 2026 · Updated last week
- [MICCAI 2025] TAT: Task-Adaptive Transformer for All-in-One Medical Image Restoration ☆35 · Dec 16, 2025 · Updated last month
- ☆29 · Jul 4, 2024 · Updated last year