km1994 / AwesomeMultiModel
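[AIGC Hands-On Beginner Notes: AIGC Skyscraper] Shares hands-on practice and experience with large language models (LLMs), efficient fine-tuning of large models (SFT), retrieval-augmented generation (RAG), agents, automatic PPT generation, role-playing, text-to-image (Stable Diffusion), optical character recognition (OCR), automatic speech recognition (ASR), text-to-speech (TTS), portrait segmentation (SA), multimodal models (VLM), AI face swapping, text-to-video (VD), image-to-video (SVD), AI motion transfer, AI virtual try-on, digital humans, omni-modal understanding (Omni), AI music generation, and other practical learning material.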
☆38Updated 5 months ago
Alternatives and similar repositories for AwesomeMultiModel
Users who are interested in AwesomeMultiModel are comparing it to the repositories listed below
- Chinese CLIP models with SOTA performance.☆58Updated 2 years ago
- An efficient multi-modal instruction-following data synthesis tool and the official implementation of Oasis https://arxiv.org/abs/2503.08…☆31Updated 4 months ago
- This project uses the LLaVA 1.6 multimodal model to implement text-to-image and image-to-image search.☆25Updated last year
- ☆16Updated last year
- A simple MLLM that surpasses QwenVL-Max using only open-source data with a 14B LLM.☆38Updated last year
- Chinese encoder for CLIP☆22Updated 3 years ago
- Our 2nd-gen LMM☆34Updated last year
- Toward Universal Multimodal Embedding☆62Updated 2 months ago
- ☆57Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆28Updated last year
- Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi…☆51Updated last year
- ☆72Updated 2 years ago
- ☆79Updated last year
- Precision Search through Multi-Style Inputs☆73Updated 2 months ago
- Facebook Image Similarity Challenge 2021☆19Updated 3 years ago
- ☆17Updated 2 years ago
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo…☆29Updated last year
- Build a simple, basic multimodal large model from scratch. 🤖☆47Updated last year
- TensorFlow implementation for Dash☆32Updated 3 years ago
- A multimodal large-scale model, which performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p…☆14Updated last year
- Survey of small language models☆16Updated last year
- Training code for Taiyi-Diffusion-XL☆23Updated last year
- ☆32Updated 3 years ago
- 8th place solution (semi-final round) of Video Copyright Detection Algorithm Track, 2019 CCF Big Data & Computing Int…☆30Updated 5 years ago
- 2019 CCF iQIYI video copy (copyright) detection algorithm☆15Updated 5 years ago
- Code for the Video Similarity Challenge.☆80Updated last year
- [ICCV2025] A Token-level Text Image Foundation Model for Document Understanding☆121Updated last month
- Chinese CLIP: supports custom datasets and extracts vectors from text and images for text-image matching.☆22Updated 3 years ago
- Video dataset dedicated to portrait-mode video recognition.☆52Updated this week
- LAVIS - A One-stop Library for Language-Vision Intelligence☆10Updated 2 years ago