List of papers about large multimodal models
☆31May 31, 2025Updated 10 months ago
Alternatives and similar repositories for Awesome-LVLM-paper
Users that are interested in Awesome-LVLM-paper are comparing it to the libraries listed below.
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"☆20Oct 17, 2024Updated last year
- Geometric Problem Solving Integrating FormalGeo Symbolic System and Hypergraph Neural Network.☆15Sep 23, 2025Updated 6 months ago
- A collection of multimodal dialogue system papers I have read (with some notes)☆22Jan 5, 2023Updated 3 years ago
- [WACV2023] This is the official PyTorch implementation of our paper "[Rethinking Rotation in Self-Supervised Contrastive Learning: Adapt…☆12Feb 24, 2023Updated 3 years ago
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient☆66Aug 3, 2025Updated 8 months ago
- LLM Reasoning Benchmark & Chain-of-Thoughts Dataset for Chemistry☆49Oct 9, 2025Updated 6 months ago
- ☆19May 14, 2024Updated last year
- ☆39Mar 19, 2026Updated last month
- ☆11May 24, 2024Updated last year
- FlashSampling: Fast and Memory-Efficient Exact Sampling (https://huggingface.co/papers/2603.15854)☆66Apr 9, 2026Updated last week
- Multi-cycle CPU supporting 54 MIPS instructions, verified through pre- and post-simulation and on-board testing☆11Jul 13, 2021Updated 4 years ago
- ☆96Mar 29, 2019Updated 7 years ago
- Intervenes in the thinking process of reasoning models such as DeepSeek-R1 to effectively control how they reason. 用…☆24Apr 14, 2025Updated last year
- ☆18Nov 3, 2025Updated 5 months ago
- Recommended resources and tools for research beginners☆17Jul 6, 2020Updated 5 years ago
- TaGAT For Multi-modal Retinal Image Fusion☆36Jul 31, 2024Updated last year
- Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation☆27Dec 12, 2025Updated 4 months ago
- Dimple, the first Discrete Diffusion Multimodal Large Language Model☆117Jul 9, 2025Updated 9 months ago
- [TGRS 2024]Diff-Mosaic: Augmenting Realistic Representations in Infrared Small Target Detection via Diffusion Prior☆20Sep 18, 2025Updated 7 months ago
- A collection of ICCV 2023 papers with code☆18Aug 12, 2023Updated 2 years ago
- ☆12Apr 18, 2025Updated last year
- Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries☆40Nov 19, 2025Updated 5 months ago
- Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights☆32Jan 9, 2026Updated 3 months ago
- CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation☆50Apr 9, 2026Updated last week
- Collected the world's best computer vision labs and lecture materials.☆14Feb 23, 2025Updated last year
- ☆11Aug 10, 2024Updated last year
- PyTorch implementation of "Sample- and Parameter-Efficient Auto-Regressive Image Models" from CVPR 2025☆14Nov 21, 2025Updated 4 months ago
- [CVPR 2025] EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance☆44Apr 18, 2025Updated last year
- PyTorch implementation for the pilot study on the robustness of latent diffusion models.☆12Jun 20, 2023Updated 2 years ago
- [CVPR2026 Highlight] Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens https://arxiv.org/abs…☆53Apr 10, 2026Updated last week
- This paper is currently under review by IEEE TCSVT, and the diffusion framework of the FedDiff algorithm part will be disclosed.☆14Mar 8, 2024Updated 2 years ago
- Lua☆58Aug 22, 2018Updated 7 years ago
- Official repo for the paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs"☆22Apr 23, 2025Updated 11 months ago
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"☆115Aug 21, 2025Updated 7 months ago
- The official GitHub page for paper "NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional St…☆25May 10, 2024Updated last year
- [ECCV2024] Official code implementation of Merlin: Empowering Multimodal LLMs with Foresight Minds☆96Jul 4, 2024Updated last year
- A PyTorch implementation of CARAFE based on ICCV 2019 paper “CARAFE: Content-Aware ReAssembly of FEatures”☆18May 21, 2020Updated 5 years ago
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give…☆221Oct 12, 2025Updated 6 months ago
- The official PyTorch implementation of “Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization”.☆19May 22, 2025Updated 10 months ago