[ICCV 2025] Preacher: Paper-to-Video Agentic System
☆41Sep 1, 2025Updated 6 months ago
Alternatives and similar repositories for Paper2Video
Users who are interested in Paper2Video are comparing it to the repositories listed below
- TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics☆21Nov 18, 2025Updated 4 months ago
- PyTorch implementation of TinyWASE described in our paper "Compressing Speaker Extraction Model with Ultra-low Precision Quantization and…☆11Jun 28, 2021Updated 4 years ago
- Implementation of "Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models" [NeurIPS 2025]☆75Dec 17, 2025Updated 3 months ago
- Cut2Next: Generating Next Shot via In-Context Tuning☆31Aug 21, 2025Updated 7 months ago
- LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently (ICML2025 Oral)☆29Oct 22, 2025Updated 4 months ago
- [TOG 2025] Order Matters: Learning Element Ordering for Graphic Design Generation☆24Aug 5, 2025Updated 7 months ago
- An unofficial implementation of Lite-RTSE, a cost-effective lite model for real-time speech enhancement☆14Nov 19, 2023Updated 2 years ago
- The power-law compressed phase-aware asymmetric (PLCPA-ASYM) loss☆14Sep 4, 2023Updated 2 years ago
- Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"☆312Sep 28, 2025Updated 5 months ago
- [ICCV 2025] Prompt-A-Video☆22Feb 2, 2025Updated last year
- An unofficial non-causal Tensorflow implementation of "Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Spee…☆14Dec 27, 2022Updated 3 years ago
- ☆15Sep 16, 2024Updated last year
- ☆18Aug 23, 2024Updated last year
- ☆23Jan 1, 2026Updated 2 months ago
- ☆13Mar 8, 2020Updated 6 years ago
- ☆13Feb 18, 2023Updated 3 years ago
- Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023)☆34Jun 8, 2023Updated 2 years ago
- ☆12May 22, 2023Updated 2 years ago
- ☆17Mar 30, 2023Updated 2 years ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.☆29Feb 17, 2025Updated last year
- Implementation of the spotlight: a method for discovering systematic errors in deep learning models☆11Oct 5, 2021Updated 4 years ago
- ☆42Mar 26, 2025Updated 11 months ago
- The PackNet Continual Learning Method in Pytorch☆15Aug 19, 2021Updated 4 years ago
- [ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series.☆464Jan 28, 2026Updated last month
- Unlocking the Essence of Beauty: Advanced Aesthetic Reasoning with Relative-Absolute Policy Optimization☆23Jan 27, 2026Updated last month
- [MICCAI 2023] Official implementation of our MICCAI 2023 paper "Pick the Best Pre-trained Model: Towards Transferability Estimation for M…☆13Jul 27, 2023Updated 2 years ago
- ☆14Oct 12, 2023Updated 2 years ago
- ☆16Sep 12, 2023Updated 2 years ago
- Heterogeneous Model Reuse via Optimizing Multiparty Multiclass Margin☆11Jan 15, 2020Updated 6 years ago
- Official code repo for our work "Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models"☆54Jun 17, 2025Updated 9 months ago
- Official PyTorch implementation of CorrespondentDream: Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences (CVPR 2024 Po…☆19Apr 29, 2024Updated last year
- Source code for "MEDIMP: 3D Medical Images with clinical Prompts from limited tabular data for renal transplantation", MIDL 2023, https:/…☆10Apr 29, 2023Updated 2 years ago
- Implementation of Attention-based Fusion for Multi-source Human Image Generation, S. Lathuilière, E. Sangineto, A. Siarohin, N. Sebe, WAC…☆10Oct 9, 2020Updated 5 years ago
- ☆16Apr 28, 2023Updated 2 years ago
- From the paper "Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition"☆27Nov 20, 2024Updated last year
- EmoCapCLIP: Learning Transferable Facial Emotion Representations from Large-Scale Semantically Rich Captions☆20Jul 29, 2025Updated 7 months ago
- Advances in recent large vision language models (LVLMs)☆15Sep 23, 2024Updated last year
- Non-Uniform FFT on the CPU and GPU (1D, 2D and 3D)☆14Jan 13, 2021Updated 5 years ago
- ☆11Jan 29, 2023Updated 3 years ago