Official implementation of the paper: [EMNLP 2025] RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction
☆21 · Dec 9, 2025 · Updated 6 months ago
Alternatives and similar repositories for RICO
Users who are interested in RICO are comparing it to the libraries listed below.
- [NAACL 2024] LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-text Generation? ☆42 · Jun 9, 2024 · Updated 2 years ago
- Official implementation of "Towards Distribution-Agnostic Generalized Category Discovery" (NIPS 2023) ☆29 · Oct 21, 2023 · Updated 2 years ago
- UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing ☆120 · Apr 16, 2025 · Updated last year
- [ICLR 2026] "VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?", Yuanxin Liu, Kun Ouyang, Haoning Wu, Yi Liu, L… ☆39 · Jan 30, 2026 · Updated 4 months ago
- [ICML 2026] Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions ☆46 · Jun 2, 2026 · Updated last week
- ☆16 · Oct 2, 2022 · Updated 3 years ago
- [ACL 2023] Transforming Visual Scene Graphs to Image Captions ☆10 · Dec 13, 2023 · Updated 2 years ago
- This repo explores how AMR can address tasks that are difficult for LLMs ☆13 · Jan 15, 2024 · Updated 2 years ago
- [ICML 2026] Elastic Diffusion Transformer: Accelerating SOTA generation models (e.g., Qwen-Image, Hunyuan3d) through adaptive computatio… ☆44 · May 1, 2026 · Updated last month
- [CIKM 2025] Constraint Back-translation Improves Complex Instruction Following of Large Language Models ☆19 · May 23, 2025 · Updated last year
- Waterbody style transfer of underwater imagery (JOE 2025) ☆26 · Dec 12, 2025 · Updated 5 months ago
- ☆43 · Dec 15, 2025 · Updated 5 months ago
- Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos ☆71 · Sep 5, 2025 · Updated 9 months ago
- [ACM MM 2022] (Oral): Multi-Modal Experience Inspired AI Creation ☆21 · Nov 27, 2024 · Updated last year
- Code for our paper "Learning to Generate Unit Tests for Automated Debugging" ☆18 · Mar 7, 2025 · Updated last year
- [IEEE TMM] Code for the paper "HRNeXt: High-Resolution Context Network for Crowd Pose Estimation" ☆10 · Feb 24, 2023 · Updated 3 years ago
- Official PyTorch implementation for Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability [Neur… ☆17 · Jul 7, 2025 · Updated 11 months ago
- PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation ☆24 · Jun 20, 2025 · Updated 11 months ago
- Self evolve extension for openclaw. Let your claw grow continuously. ☆96 · Apr 12, 2026 · Updated last month
- ☆14 · Nov 14, 2023 · Updated 2 years ago
- [NeurIPS 2025, Spotlight]: Ambient-o: Training Good models with Bad Data. ☆35 · Apr 6, 2026 · Updated 2 months ago
- [CVPR 2026] Official repository for "Reviving ConvNeXt for Efficient Convolutional Diffusion Models" ☆66 · Mar 26, 2026 · Updated 2 months ago
- [EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences" ☆20 · Oct 2, 2024 · Updated last year
- Code for "Spatial-Aware Regression for Keypoint Localization", CVPR 2024 Highlight ☆19 · Jun 15, 2024 · Updated last year
- ☆34 · Jul 15, 2025 · Updated 10 months ago
- ☆21 · Mar 3, 2026 · Updated 3 months ago
- Github Pages template for academic portfolio websites ☆17 · Oct 22, 2024 · Updated last year
- [ICLR'25] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints ☆689 · May 23, 2025 · Updated last year
- Macro-from-Micro Planning for High-Quality and Parallelized Autoregressive Long Video Generation ☆39 · Oct 31, 2025 · Updated 7 months ago
- [ICCV 2025] Official Implementation of RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model for Referring … ☆20 · Jun 27, 2025 · Updated 11 months ago
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning ☆54 · Jul 23, 2025 · Updated 10 months ago
- SWE-Exp: Experience-Driven Software Issue Resolution ☆40 · Oct 17, 2025 · Updated 7 months ago
- Paper: “MEMRL: SELF-EVOLVING AGENTS VIA RUNTIME REINFORCEMENT LEARNING ON EPISODIC MEMORY” Open-Source Code ☆130 · May 2, 2026 · Updated last month
- ☆12 · May 21, 2019 · Updated 7 years ago
- Codebase for VideoConviction, accepted at KDD 2025 (D&B Track) ☆18 · Jan 22, 2026 · Updated 4 months ago
- Official implementation for 'SDPose: Tokenized Pose Estimation via Circulation-Guide Self-Distillation' on CVPR 2024 ☆18 · May 15, 2024 · Updated 2 years ago
- Server GPU monitoring program that sends a notification via WeChat when GPU attributes meet preset conditions ☆34 · Aug 10, 2021 · Updated 4 years ago
- [ECCV 2024] GTPT: Group-based Token Pruning Transformer for Efficient Human Pose Estimation ☆19 · Oct 5, 2024 · Updated last year
- Official repository of SoftREPA: Aligning Text to Image in Diffusion Models is Easier Than You Think ☆24 · Jun 5, 2025 · Updated last year