List of papers about large multimodal models
☆31May 31, 2025Updated 9 months ago
Alternatives and similar repositories for Awesome-LVLM-paper
Users that are interested in Awesome-LVLM-paper are comparing it to the libraries listed below
- [ACM MM 2023] The released code of paper "Deconfounded Visual Question Generation with Causal Inference"☆11Sep 3, 2024Updated last year
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"☆20Oct 17, 2024Updated last year
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient☆66Aug 3, 2025Updated 7 months ago
- Student course selection system☆11Mar 1, 2023Updated 3 years ago
- ☆11May 24, 2024Updated last year
- ROCT-Net: A new ensemble deep convolutional model with improved spatial resolution learning for detecting common diseases from retinal OC…☆12Mar 3, 2022Updated 4 years ago
- Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights☆32Jan 9, 2026Updated 2 months ago
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year
- Micro-Attention for Micro-Expression Recognition☆10Mar 11, 2021Updated 4 years ago
- ☆12Apr 18, 2025Updated 10 months ago
- [WACV2023] This is the official PyTorch implementation of our paper "[Rethinking Rotation in Self-Supervised Contrastive Learning: Adapt…☆12Feb 24, 2023Updated 3 years ago
- Motion-sensing game control system based on bone point recognition☆10Dec 1, 2023Updated 2 years ago
- Scrapes the comments of a specific Weibo post☆12Sep 22, 2017Updated 8 years ago
- ☆10Aug 10, 2024Updated last year
- This paper is currently under review by IEEE TCSVT, and the diffusion framework of the FedDiff algorithm part will be disclosed.☆14Mar 8, 2024Updated 2 years ago
- ☆10Jun 20, 2025Updated 8 months ago
- ☆13May 31, 2024Updated last year
- An implementation of EMMA (End-to-End Multimodal Model for Autonomous Driving) using the Claude API, based on the EMMA paper.☆12Dec 14, 2024Updated last year
- PyTorch implementation for the pilot study on the robustness of latent diffusion models.☆13Jun 20, 2023Updated 2 years ago
- ☆29Dec 22, 2025Updated 2 months ago
- Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries☆35Nov 19, 2025Updated 3 months ago
- This code is submitted to ICCV Workshop 2017: Fake vs. true facial emotion recognition competition☆11Oct 17, 2017Updated 8 years ago
- C3D,R(21)D,R3D--pytorch☆10Sep 11, 2018Updated 7 years ago
- Daily study notes☆12May 11, 2023Updated 2 years ago
- Peking University School of Software and Microelectronics (PKU-SS) course sharing☆20Mar 1, 2024Updated 2 years ago
- PyTorch implementation of "Sample- and Parameter-Efficient Auto-Regressive Image Models" from CVPR 2025☆14Nov 21, 2025Updated 3 months ago
- TaGAT For Multi-modal Retinal Image Fusion☆11Jul 31, 2024Updated last year
- ☆22May 26, 2025Updated 9 months ago
- Official code for "Federated Weakly Supervised Video Anomaly Detection with Multimodal Prompt" (AAAI2025)☆25May 27, 2025Updated 9 months ago
- SW components and demos for visual kinship recognition. An emphasis is put on the FIW dataset-- data loaders, benchmarks, results in summ…☆17Mar 13, 2023Updated 2 years ago
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation☆71Oct 17, 2025Updated 4 months ago
- Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models☆68Feb 18, 2025Updated last year
- Tracking Pedestrians using HOG Features and a Particle Filter☆12Oct 1, 2014Updated 11 years ago
- ☆75Mar 18, 2025Updated 11 months ago
- [CVPR 2024] "Data Poisoning based Backdoor Attacks to Contrastive Learning": official code implementation.☆16Feb 10, 2025Updated last year
- ☆31Sep 14, 2025Updated 5 months ago
- Keeps track of recent advances in Temporal Action Localization☆12May 1, 2020Updated 5 years ago
- Lua☆58Aug 22, 2018Updated 7 years ago
- 🔥Awesome Multimodal Large Language Models Paper List☆155Mar 12, 2025Updated 11 months ago