axzml / ChatSD
ChatSD is designed to make image generation tasks easy
☆20Updated last year
Alternatives and similar repositories for ChatSD:
Users that are interested in ChatSD are comparing it to the libraries listed below
- ☆32Updated 7 months ago
- Chinese CLIP models with SOTA performance.☆51Updated last year
- ☆67Updated last year
- TagGPT: Large Language Models are Zero-shot Multimodal Taggers☆61Updated last year
- WuDaoMM: a data project☆69Updated 2 years ago
- ☆24Updated last year
- A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM.☆36Updated 4 months ago
- Empirical Study Towards Building An Effective Multi-Modal Large Language Model☆23Updated last year
- A fine-tuned version of the Stable Diffusion model, trained on a self-translated 10k DiffusionDB Chinese corpus and "extended"☆31Updated last year
- The simple demo of `Unified Vision-Language Representation Modeling for E-Commerce Same-Style Products Retrieval`☆13Updated last month
- An open-source multimodal large language model based on baichuan-7b☆73Updated last year
- Lion and Adam optimization comparison☆56Updated last year
- Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning☆28Updated last year
- Taiyi-Diffusion-XL training code☆21Updated 7 months ago
- A native Chinese, multi-level text-to-video evaluation benchmark☆17Updated 6 months ago
- A light proxy solution for HuggingFace hub.☆46Updated last year
- Inference speed-up for stable-diffusion (ldm) with TensorRT.☆35Updated last year
- SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama: https://arxiv.org/abs/2408.09333v2☆109Updated 2 months ago
- Converting Mixtral-8x7B to Mixtral-[1~7]x7B☆20Updated 10 months ago
- This project uses the LLaVA 1.6 multimodal model to implement text-to-image search and image-to-image search.☆19Updated 10 months ago
- Generative Recommendation: Towards Next-generation Recommender Paradigm☆57Updated last year
- ☆32Updated 2 years ago
- official code for paper: Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark☆34Updated last year
- code for paper "RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement"☆31Updated last year
- ⚡FlashRAG: A Python Toolkit for Efficient RAG Research☆22Updated last month
- CLIP Chinese encoder☆22Updated 2 years ago
- Large Multimodal Model☆14Updated 9 months ago
- ☆65Updated last year
- the world's first large-scale multi-modal short-video encyclopedia, where the primitive units are items, aspects, and short videos.☆60Updated last year