qzp2018/AnyTrans

AnyTrans: Translate AnyText in the Image with Large Scale Models (EMNLP2024 Findings)

☆25

Alternatives and similar repositories for AnyTrans

Users who are interested in AnyTrans are comparing it to the libraries listed below.

Lee-zixu / FineCIR
☆12 · Mar 31, 2025 · Updated last year
XMUDeepLIT / mc_tit
Code for ACL 2023 paper: Exploring Better Text Image Translation with Multimodal Codebook
☆21 · Apr 19, 2026 · Updated 3 months ago
qzp2018 / MCLN
This is a PyTorch implementation of MCLN proposed by our paper "Multi-branch Collaborative Learning Network for 3D Visual Grounding" (ECCV…
☆27 · Oct 10, 2024 · Updated last year
qzp2018 / UniECS
Official implementation of CIKM 2025: "UniECS: Unified Multimodal E-Commerce Search Framework with Gated Cross-modal Fusion"
☆21 · Sep 17, 2025 · Updated 10 months ago
80chen86 / IPDN
☆17 · Dec 25, 2025 · Updated 6 months ago
Leon1207 / 3DRefTR
This is a PyTorch implementation of 3DRefTR proposed by our paper "A Unified Framework for 3D Point Cloud Visual Grounding"
☆26 · Aug 24, 2023 · Updated 2 years ago
xu-shitong / TSE-through-Positive-Negative-Enroll
Official implementation of paper "Target Speaker Extraction through Comparing Noisy Positive and Negative Audio Enrollments"
☆21 · Updated this week
sosppxo / RG-SAN
[NeurIPS 2024 Oral] RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation
☆20 · Dec 22, 2024 · Updated last year
nini0919 / SemiRES
[ICML 2024] The official implementation of SemiRES in PyTorch.
☆33 · Jun 20, 2024 · Updated 2 years ago
yhzhu99 / tutorials
tutorials
☆22 · Aug 12, 2022 · Updated 3 years ago
anton-jeran / AV-RIR
Audio-Visual Room Impulse Response Estimation
☆25 · Jul 22, 2024 · Updated 2 years ago
violet-zct / fairseq-dro-mnmt
☆14 · Sep 10, 2021 · Updated 4 years ago
Chiangsonw / CaLa
The official code of "CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval"
☆15 · Sep 19, 2024 · Updated last year
hsing-wang / WMT2020_BioMedical
☆15 · Jul 16, 2021 · Updated 5 years ago
ttgeng233 / UniAV
Unified Audio-Visual Perception for Multi-Task Video Localization
☆33 · Apr 19, 2024 · Updated 2 years ago
fsssosei / similarity_index_of_label_graph
This is the package used to calculate the similarity index of the label graph pairs.
☆13 · Nov 4, 2020 · Updated 5 years ago
scofield7419 / UMMT-VSH
Code for the ACL 2023 paper Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Sc…
☆12 · May 19, 2023 · Updated 3 years ago
DaehanKim-Korea / VisDA2022_1st_Place_Solution
☆11 · Jun 3, 2023 · Updated 3 years ago
RoyZhao926 / InstructBrush
Official repository of the paper InstructBrush: Learning Attention-based Instruction Optimization for Image Editing
☆16 · Apr 14, 2024 · Updated 2 years ago
Toloka / WSDMCup2023
Toloka Visual Question Answering Challenge at WSDM Cup 2023
☆30 · May 1, 2024 · Updated 2 years ago
XL2248 / SOV-MAS
The code and data for "Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization"
☆11 · May 16, 2023 · Updated 3 years ago
PRIS-CV / Category-Specific-Prompt
Code release for "Category-Specific Prompts for Animal Action Recognition with Pretrained Vision-Language Models"
☆14 · Feb 21, 2024 · Updated 2 years ago
Mutoy-choi / Tryondiffusion
☆54 · Jul 30, 2024 · Updated last year
iLearn-Lab / SIGIR24-DQU-CIR
[SIGIR 2024] - Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval
☆44 · Jul 14, 2024 · Updated 2 years ago
ldynx / SAVE
☆25 · Nov 22, 2024 · Updated last year
kojima-takeshi188 / CFA
☆12 · Jul 21, 2022 · Updated 4 years ago
aleXiehta / AD-FlowTSE
Adaptive Flow-Matching for Target Speaker Extraction
☆39 · Jul 13, 2026 · Updated last week
Disguiser15 / RefTeacher
RefTeacher is a strong baseline method for Semi-Supervised Referring Expression Comprehension.
☆14 · May 26, 2023 · Updated 3 years ago
Eurus-Holmes / SynthText_CH
[SynthText Chinese] Improved code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural I…
☆14 · Dec 8, 2022 · Updated 3 years ago
Cuberick-Orion / Candidate-Reranking-CIR
The official implementation for Candidate Set Re-ranking for Composed Image Retrieval (TMLR) 01/2024
☆20 · Feb 7, 2024 · Updated 2 years ago
xdxie / WAS_WordArt-Segmentation
The official codes and datasets for Artistic Text Segmentation (ECCV 2024).
☆30 · Sep 24, 2025 · Updated 10 months ago
VIM-Bench / VIM_TOOL
☆12 · Jun 12, 2024 · Updated 2 years ago
KoDohwan / VT-TWINS
Video-Text Representation Learning via Differentiable Weak Temporal Alignment (PyTorch implementation for the CVPR 2022 paper)
☆11 · Oct 12, 2022 · Updated 3 years ago
hellloxiaotian / SWCNN
A self-supervised CNN for image watermark removal (IEEE Transactions on Circuits and Systems for Video 2024)
☆74 · Aug 21, 2024 · Updated last year
facebookresearch / MoCA
Motion-conditional image animation for video editing
☆20 · Dec 2, 2023 · Updated 2 years ago
dongjunKANG / VIM
☆11 · Oct 16, 2023 · Updated 2 years ago
sauradip / MUPPET
[arXiv 2023] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection"
☆16 · Aug 30, 2023 · Updated 2 years ago
VisuLogic-Benchmark / VisuLogic-Eval
☆37 · Aug 18, 2025 · Updated 11 months ago
JustinYuu / MM_Pyramid
[ACM MM 2022] MM_Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing
☆15 · Aug 26, 2022 · Updated 3 years ago