axzml / ChatSD
ChatSD is designed to make image generation tasks easy
☆20Updated last year
Related projects
Alternatives and complementary repositories for ChatSD
- TagGPT: Large Language Models are Zero-shot Multimodal Taggers☆61Updated last year
- Chinese CLIP models with SOTA performance.☆48Updated last year
- ☆66Updated last year
- A fine-tuned version of the Stable Diffusion model on a self-translated 10k diffusiondb Chinese corpus, and "extend" it☆31Updated last year
- An open-source multimodal large language model based on baichuan-7b☆72Updated 11 months ago
- A light proxy solution for HuggingFace hub.☆44Updated last year
- The simple demo of `Unified Vision-Language Representation Modeling for E-Commerce Same-Style Products Retrieval`☆11Updated last year
- Empirical Study Towards Building An Effective Multi-Modal Large Language Model☆23Updated last year
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆56Updated this week
- SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama: https://arxiv.org/abs/2408.09333v2☆99Updated this week
- ☆30Updated 6 months ago
- ☆17Updated last year
- This project uses the LLaVA 1.6 multimodal model to implement text-to-image and image-to-image search.☆17Updated 8 months ago
- WuDaoMM this is a data project☆66Updated 2 years ago
- ☆24Updated last year
- Chinese encoder for CLIP☆21Updated 2 years ago
- A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM.☆36Updated 2 months ago
- ☆77Updated 6 months ago
- code for paper 《RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement》☆29Updated 10 months ago
- official code for paper: Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark☆31Updated 10 months ago
- Touchstone: Evaluating Vision-Language Models by Language Models☆78Updated 10 months ago
- Inference speed-up for stable-diffusion (ldm) with TensorRT.☆35Updated last year
- Source code for EMNLP2022 long paper: Parameter-Efficient Tuning Makes a Good Classification Head☆13Updated 2 years ago
- A fully differentiable architecture search for GANs☆17Updated 3 years ago
- SuperCLUE-Math6: Exploring a new generation of native Chinese multi-turn, multi-step mathematical reasoning datasets☆46Updated 9 months ago
- ☆31Updated last week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆19Updated last year
- [CVPR 2024] DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model☆16Updated 7 months ago
- A native Chinese multi-level text-to-video evaluation benchmark☆17Updated 4 months ago