km1994 / AwesomeMultiModel
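[AIGC Hands-On Beginner Notes: AIGC Skyscraper] Shares practical learning material, hands-on practice, and experience covering large language models (LLMs), efficient fine-tuning of large models (SFT), retrieval-augmented generation (RAG), agents (Agent), automatic PPT generation, role-playing, text-to-image (Stable Diffusion), optical character recognition (OCR), automatic speech recognition (ASR), text-to-speech (TTS), portrait segmentation (SA), multimodal models (VLM), AI face swapping (Face Swapping), text-to-video (VD), image-to-video (SVD), AI motion transfer, AI virtual try-on, digital humans, omni-modal understanding (Omni), AI music generation, and more.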
☆17Updated 2 months ago
Alternatives and similar repositories for AwesomeMultiModel
Users that are interested in AwesomeMultiModel are comparing it to the libraries listed below
- Chinese CLIP models with SOTA performance.☆55Updated last year
- Modify-Anything is based on yolov5 and yolov8 for video and image detection. Segment-anything and lama_cleaner are applied to segment, modify, era…☆15Updated 2 years ago
- A multimodal large-scale model, which performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p…☆14Updated last year
- A native Chinese text-to-image evaluation benchmark☆9Updated last year
- Non-local Modeling for Image Quality Assessment☆13Updated last year
- [FGVC9-CVPR 2022] The second place solution for 2nd eBay eProduct Visual Search Challenge.☆26Updated 2 years ago
- ☆15Updated 9 months ago
- Task Agnostic Unsupervised Learning☆15Updated 3 years ago
- Facebook Image Similarity Challenge 2021☆19Updated 3 years ago
- Code for the Video Similarity Challenge.☆81Updated last year
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`☆12Updated 3 years ago
- ☆22Updated 3 years ago
- Chinese encoder for CLIP☆22Updated 3 years ago
- Code for the ICCV 2021 paper "Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation".☆13Updated 3 years ago
- ☆28Updated 3 years ago
- [CVPR Challenge Rank 2nd] The codes and related files to reproduce the results for Video Similarity Challenge Descriptor Track.☆19Updated 3 months ago
- A fine-tuned version of the Stable Diffusion model, trained on a self-translated 10k diffusiondb Chinese corpus and then "extended"☆31Updated 2 years ago
- An interactive demo based on Segment-Anything for stroke-based painting which enables human-like painting.☆35Updated 2 years ago
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆26Updated last year
- WeChat official account: 机器感知 (Machine Perception) | Tracking the Latest arXiv Papers☆38Updated last month
- Chinese Stable Diffusion, zh SD, Chinese text-to-image, Chinese SD☆49Updated last year
- A roundup of data competition news from China and abroad☆18Updated 3 years ago
- A PyTorch implementation of Proxy Anchor Loss based on CVPR 2020 paper "Proxy Anchor Loss for Deep Metric Learning"☆10Updated 4 years ago
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a vision encoder. Unifies image understanding and generation.☆37Updated last year
- ☆15Updated 6 months ago
- An AIGC application that integrates an LLM with SDXL☆29Updated last year
- A tool for generating datasets from documents☆12Updated 3 months ago
- A fast method for real face morphing (a face morphing method that can be deployed quickly)☆11Updated 3 years ago
- ☆11Updated 4 years ago
- A simple MLLM that surpasses QwenVL-Max using only open-source data, with a 14B LLM.☆37Updated 10 months ago