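《多模态大模型:新一代人工智能技术范式》 ("Multimodal Large Models: A New-Generation Artificial Intelligence Technology Paradigm"), by Liu Yang (刘阳) and Lin Liang (林倞)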
☆266Mar 19, 2026Updated this week
Alternatives and similar repositories for Book-of-MLM
Users that are interested in Book-of-MLM are comparing it to the libraries listed below.
- [Embodied-AI-Survey-2025] Paper List and Resource Repository for Embodied AI☆1,948Mar 11, 2026Updated last week
- Embodied Question Answering (EQA) benchmark and method (ICCV 2025)☆48Aug 12, 2025Updated 7 months ago
- The official repository of [CVPR2025] DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering☆25Apr 18, 2025Updated 11 months ago
- A Practical Guide to Large Language Models: Applied Practice and Real-World Deployment☆88Sep 13, 2024Updated last year
- VisionGRU: A Linear-Complexity RNN Model for Efficient Image Analysis☆13Dec 26, 2024Updated last year
- [IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering☆20Jul 6, 2023Updated 2 years ago
- Companion code for the book 《基于BERT模型的自然语言处理实战》 ("Natural Language Processing in Practice with the BERT Model")☆17Jun 13, 2022Updated 3 years ago
- ☆13Oct 23, 2023Updated 2 years ago
- Transferable Feature Representation for Visible-to-Infrared Cross-Dataset Human Action Recognition (Complexity 2018)☆13Dec 14, 2022Updated 3 years ago
- The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding" (CVPR 2025 Highlight)☆45Apr 27, 2025Updated 10 months ago
- A general-purpose collection of simple tools☆22Oct 6, 2024Updated last year
- The code for the book AAML by Abhishek Thakur☆13Sep 4, 2020Updated 5 years ago
- Scrapes paper lists from ICCV and similar conference web pages and fetches the related ArXiv material☆14Oct 19, 2023Updated 2 years ago
- [IEEE T-IP 2022] TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning☆24Dec 19, 2023Updated 2 years ago
- 3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians (ACM MM 25)☆74Jul 21, 2025Updated 8 months ago
- The collections of MOE (Mixture Of Expert) papers, code and tools, etc.☆12Mar 15, 2024Updated 2 years ago
- ☆84Updated this week
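- 2025.01: A multimodal large model implemented from scratch and named Reyes (睿视; R for 睿, "wise", and eyes for 眼, "eye"). Reyes has 8B parameters, uses InternViT-300M-448px-V2_5 as its vision encoder and Qwen2.5-7B-Instruct on the language model side, and also, through a two…☆33Feb 10, 2026Updated last month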
- KMM: Key Frame Mask Mamba for Extended Motion Generation☆19Sep 22, 2025Updated 6 months ago
- Notes on multimodal knowledge for large language model (LLMs) algorithm (application) engineers☆272May 12, 2024Updated last year
- https://haa.boyuai.com☆61Dec 8, 2025Updated 3 months ago
- Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS2024 Oral)☆35Jan 18, 2025Updated last year
- ☆21Mar 1, 2022Updated 4 years ago
- Open Source Road Datasets☆18Aug 30, 2024Updated last year
- UGRoadUpd: An unchanged-guided road updating framework based on remotely sensed imagery☆12Mar 15, 2023Updated 3 years ago
- A Qwen-VL model that can be successfully fine-tuned with LoRA☆16Oct 27, 2023Updated 2 years ago
- Code and Model of the SwinTD_Net for Single Image Dehazing☆11Jul 21, 2023Updated 2 years ago
- Implementation of KDR-Agent, the AAAI 2025 accepted paper, focusing on knowledge-driven reasoning for autonomous agents.☆18Nov 24, 2025Updated 4 months ago
- ☆21Nov 5, 2024Updated last year
- chatglm on multiple GPUs using deepspeed and…☆408Jul 8, 2024Updated last year
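- A basic, beginner-friendly introduction to natural language processing, based on deepseek-r1, covering both traditional NLP and modern large models☆23Jan 16, 2026Updated 2 months ago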
- ☆32Feb 2, 2026Updated last month
- Official code of the paper "Synthetic Instance Segmentation from Semantic Image Segmentation Masks"☆21Oct 31, 2023Updated 2 years ago
- 《大语言模型》 ("Large Language Models"), by Zhao Xin, Li Junyi, Zhou Kun, Tang Tianyi, and Wen Jirong☆4,390Sep 2, 2025Updated 6 months ago
- Official release of FacialFlowNet: Advancing Facial Optical Flow Estimation with a Diverse Dataset and a Decomposed Model (ACMMM2024)☆26Nov 11, 2024Updated last year
- ☆57Mar 14, 2026Updated last week
- 😎 Awesome lists of papers and codes about open-vocabulary perception, including both 3D and 2D☆65Jul 27, 2025Updated 7 months ago
- ☆29Dec 10, 2021Updated 4 years ago
- CSANet: Cross-Temporal Interaction Symmetric Attention Network for Hyperspectral Image Change Detection☆12Sep 13, 2022Updated 3 years ago