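Companion teaching resources for the book 《多模态大模型:新一代人工智能技术范式》 (Multimodal Large Models: A New-Generation Artificial Intelligence Technology Paradigm)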
☆275 · Apr 30, 2026 · Updated this week
Alternatives and similar repositories for Book-of-MLM
Users who are interested in Book-of-MLM are comparing it to the libraries listed below.
- [Embodied-AI-Survey-2025] Paper List and Resource Repository for Embodied AI ☆2,025 · Apr 16, 2026 · Updated 2 weeks ago
- Embodied Question Answering (EQA) benchmark and method (ICCV 2025) ☆49 · Aug 12, 2025 · Updated 8 months ago
- A Practical Guide to Large Language Models: Application Practice and Real-World Deployment ☆88 · Sep 13, 2024 · Updated last year
- VisionGRU: A Linear-Complexity RNN Model for Efficient Image Analysis ☆13 · Dec 26, 2024 · Updated last year
- Companion code for the book 《基于BERT模型的自然语言处理实战》 (Natural Language Processing in Practice with BERT Models) ☆17 · Jun 13, 2022 · Updated 3 years ago
- Transferable Feature Representation for Visible-to-Infrared Cross-Dataset Human Action Recognition (Complexity 2018) ☆13 · Dec 14, 2022 · Updated 3 years ago
- The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding" (CVPR 2025 Highlight) ☆50 · Apr 27, 2025 · Updated last year
- [IEEE T-CSVT 2019] Hierarchically Learned View-Invariant Representations for Cross-View Action Recognition ☆14 · Nov 26, 2019 · Updated 6 years ago
- A general-purpose collection of simple tools ☆22 · Oct 6, 2024 · Updated last year
- A collection of MoE (Mixture of Experts) papers, code, tools, etc. ☆12 · Mar 15, 2024 · Updated 2 years ago
- ☆85 · Apr 7, 2026 · Updated 3 weeks ago
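- 2025.01: A multimodal large model implemented from scratch and named Reyes (睿视; R for 睿, "eyes" for 眼). Reyes has 8B parameters, uses InternViT-300M-448px-V2_5 as its vision encoder and Qwen2.5-7B-Instruct on the language-model side. Reyes also, through a two… ☆33 · Feb 10, 2026 · Updated 2 months ago
- 🧑‍🚀 Summary of the world's best LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models). ☆8,195 · Updated this week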
- Official implementation of paper "OED: Towards One-stage End-to-End Dynamic Scene Graph Generation". ☆29 · Mar 26, 2024 · Updated 2 years ago
- KMM: Key Frame Mask Mamba for Extended Motion Generation ☆19 · Sep 22, 2025 · Updated 7 months ago
- [IEEE T-IP 2021] Semantics-aware Adaptive Knowledge Distillation for Cross-modal Action Recognition ☆29 · Jan 6, 2025 · Updated last year
- Notes mainly covering multimodal knowledge for large language model (LLM) algorithm (application) engineers ☆278 · May 12, 2024 · Updated last year
- Implementation for "What it Thinks is Important is Important: Robustness Transfers through Input Gradients" (CVPR 2020 Oral) ☆16 · Mar 24, 2023 · Updated 3 years ago
- ☆21 · Mar 1, 2022 · Updated 4 years ago
- Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral) ☆36 · Jan 18, 2025 · Updated last year
- ☆24 · Apr 16, 2022 · Updated 4 years ago
- UGRoadUpd: An unchanged-guided road updating framework based on remotely sensed imagery ☆12 · Mar 15, 2023 · Updated 3 years ago
- A Qwen-VL model that can be successfully fine-tuned with LoRA ☆16 · Oct 27, 2023 · Updated 2 years ago
- Implementation for "Change Event Dataset for Discovery from Spatio-temporal Remote Sensing Imagery" ☆17 · Aug 9, 2022 · Updated 3 years ago
- Deep Correlated Prompting for Visual Recognition with Missing Modalities (NeurIPS 2024) ☆36 · Mar 6, 2025 · Updated last year
- ChatGLM on multiple GPUs using DeepSpeed and… ☆408 · Jul 8, 2024 · Updated last year
- [ICCV 2023] Official repository of paper titled "Why Is Prompt Tuning for Vision-Language Models Robust to Noisy Labels?" ☆27 · Sep 20, 2023 · Updated 2 years ago
- Official release of FacialFlowNet: Advancing Facial Optical Flow Estimation with a Diverse Dataset and a Decomposed Model (ACMMM2024) ☆26 · Nov 11, 2024 · Updated last year
- [AAAI 2024] M3SOT: Multi-frame, Multi-field, Multi-space 3D Single Object Tracking ☆16 · Apr 29, 2024 · Updated 2 years ago
- Llama3-Tutorial (XTuner, LMDeploy, OpenCompass) ☆509 · May 10, 2024 · Updated last year
- ☆39 · Aug 26, 2025 · Updated 8 months ago
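- The most basic, beginner-friendly introduction to natural language processing, based on deepseek-r1, covering both traditional NLP and modern large models ☆27 · Jan 16, 2026 · Updated 3 months ago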
- 😎 Awesome lists of papers and codes about open-vocabulary perception, including both 3D and 2D ☆65 · Jul 27, 2025 · Updated 9 months ago
- ☆29 · Dec 10, 2021 · Updated 4 years ago
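- This project aims to share the technical principles and hands-on experience behind large models (large-model engineering and application deployment) ☆24,178 · Mar 12, 2026 · Updated last month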
- ☆19 · Jul 23, 2024 · Updated last year
- [RAL 2024] Triplet-Graph: Global Metric Localization Based on Semantic Triplet Graph for Autonomous Vehicles ☆10 · Mar 23, 2024 · Updated 2 years ago
- ☆10 · Aug 9, 2018 · Updated 7 years ago
- [EMNLP 2025 main] C3 Benchmark: A Bilingual Benchmark for Spoken Dialogue Models Exploring Challenges in Complex Conversations ☆30 · Dec 24, 2025 · Updated 4 months ago