Empirical Study Towards Building An Effective Multi-Modal Large Language Model
☆22Oct 25, 2023Updated 2 years ago
Alternatives and similar repositories for Skywork-MM
Users who are interested in Skywork-MM are comparing it to the libraries listed below.
- Partially Non-Autoregressive Image Captioning☆10Sep 30, 2021Updated 4 years ago
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆29Dec 1, 2022Updated 3 years ago
- [TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"☆20Sep 15, 2023Updated 2 years ago
- [ACL 2023] Delving into the Openness of CLIP☆24Jan 11, 2023Updated 3 years ago
- Video Diffusion State Space Models☆19Mar 27, 2024Updated 2 years ago
- Blending Custom Photos with Video Diffusion Transformers☆48Jan 21, 2025Updated last year
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM☆101May 17, 2024Updated last year
- [ICLR 2026] Computer Agent Arena: Toward Human-Centric Evaluation and Analysis of Computer-Use Agents☆59Feb 26, 2026Updated last month
- Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening☆70May 18, 2025Updated 10 months ago
- ☆29Mar 24, 2025Updated last year
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…☆124Nov 25, 2024Updated last year
- [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"☆105Nov 9, 2023Updated 2 years ago
- A library for fast, distributed clustering☆16Feb 16, 2010Updated 16 years ago
- TaiYiXLCheckpointLoader: An unofficial node supporting the Taiyi-Diffusion-XL (Taiyi-XL) Chinese-English bilingual language model☆11Sep 1, 2024Updated last year
- ☆26Jun 25, 2021Updated 4 years ago
- Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022☆11Aug 20, 2022Updated 3 years ago
- Anything Model Batch Downloader lets you batch download models from Civitai and Hugging Face using just the model URL.☆15Mar 19, 2023Updated 3 years ago
- A Data Source for Reasoning Embodied Agents☆19Sep 18, 2023Updated 2 years ago
- Code for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025]☆48Jul 22, 2025Updated 8 months ago
- Unity 3D Code for "Building Tilt Brush from Scratch" YouTube tutorial by Fuseman☆11Mar 1, 2017Updated 9 years ago
- ☆28Jan 6, 2026Updated 3 months ago
- ☆30Feb 16, 2024Updated 2 years ago
- 2022 WAIC Hackathon, Ant Fortune track: AntSQL large-scale financial semantic parsing Chinese Text-to-SQL challenge. A beginner's code.☆13Mar 11, 2023Updated 3 years ago
- ☆13Nov 8, 2019Updated 6 years ago
- This repository is home to a Unity project with 36 different shaders and 6 different particle systems to be tested all in the same scene …☆18Apr 15, 2024Updated last year
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain☆107Mar 14, 2024Updated 2 years ago
- Our 2nd-gen LMM☆34May 22, 2024Updated last year
- The official GitHub page for ''What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins…☆19Nov 10, 2023Updated 2 years ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆117Sep 26, 2024Updated last year
- General-purpose Visual Understanding Evaluation☆20Dec 21, 2023Updated 2 years ago
- Train toy models using multi-token prediction objective☆14May 8, 2024Updated last year
- ☆13May 11, 2022Updated 3 years ago
- [ACL 2024] TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆47Sep 19, 2023Updated 2 years ago
- An Extension for Automatic1111 Webui that makes the interface easier to use on mobile (portrait)☆16Apr 16, 2024Updated last year
- Web page for "🍅HumanTOMATO: Text-aligned Whole-body Motion Generation".☆15May 25, 2024Updated last year
- HSTU-BLaIR: Lightweight Contrastive Text Embedding for Generative Recommender 🌱☆25Jul 4, 2025Updated 9 months ago
- ☆13Jul 10, 2024Updated last year
- A simple exam generator and grader written in Python with OpenCV☆14Jan 14, 2026Updated 2 months ago
- Created for this model trained by Gustavosta for Stable Diffusion to create a prompt from a few words. You can submit your own text or se…☆17Feb 13, 2023Updated 3 years ago