Empirical Study Towards Building An Effective Multi-Modal Large Language Model
☆22Oct 25, 2023Updated 2 years ago
Alternatives and similar repositories for Skywork-MM
Users that are interested in Skywork-MM are comparing it to the libraries listed below.
- Partially Non-Autoregressive Image Captioning☆10Sep 30, 2021Updated 4 years ago
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆30Dec 1, 2022Updated 3 years ago
- [TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"☆20Sep 15, 2023Updated 2 years ago
- Developer project for getting basic API integrations working in under 5 minutes☆11Jan 30, 2026Updated 3 months ago
- [ACL 2023] Delving into the Openness of CLIP☆24Jan 11, 2023Updated 3 years ago
- Video Diffusion State Space Models☆19Mar 27, 2024Updated 2 years ago
- Dynamic Early Exit for Image Captioning☆17Oct 25, 2022Updated 3 years ago
- ☆24Oct 8, 2023Updated 2 years ago
- Blending Custom Photos with Video Diffusion Transformers☆50Jan 21, 2025Updated last year
- Generate consistent videos with stable diffusion models☆51Jan 20, 2023Updated 3 years ago
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM☆101May 17, 2024Updated last year
- [ICLR 2026] Computer Agent Arena: Toward Human-Centric Evaluation and Analysis of Computer-Use Agents☆59Feb 26, 2026Updated 2 months ago
- character recognition, textline recognition☆10Aug 31, 2019Updated 6 years ago
- ☆29Mar 24, 2025Updated last year
- [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"☆106Nov 9, 2023Updated 2 years ago
- music generation with perceiver-ar model☆26Jul 20, 2022Updated 3 years ago
- TaiYiXLCheckpointLoader: An unofficial node supporting the Taiyi-Diffusion-XL (Taiyi-XL) Chinese-English bilingual language model☆11Sep 1, 2024Updated last year
- ☆26Jun 25, 2021Updated 4 years ago
- Anything Model Batch Downloader lets you batch download models from Civitai and Hugging Face using only the model URL.☆14Mar 19, 2023Updated 3 years ago
- A Data Source for Reasoning Embodied Agents☆19Sep 18, 2023Updated 2 years ago
- Code for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025]☆48Jul 22, 2025Updated 9 months ago
- ☆30Feb 16, 2024Updated 2 years ago
- python library for reverse engineered Adobe Firefly API☆13Mar 31, 2023Updated 3 years ago
- ☆12Nov 8, 2019Updated 6 years ago
- This repository is home to a Unity project with 36 different shaders and 6 different particle systems to be tested all in the same scene …☆18Apr 15, 2024Updated 2 years ago
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain☆107Mar 14, 2024Updated 2 years ago
- Our 2nd-gen LMM☆34May 22, 2024Updated last year
- Cyberdolphin Suite of ComfyUI nodes for wiring up OpenAI and compatible LLM APIs.☆15Jul 31, 2024Updated last year
- The official GitHub page for ''What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins…☆19Nov 10, 2023Updated 2 years ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆118Sep 26, 2024Updated last year
- ☆13May 11, 2022Updated 3 years ago
- Web page for "🍅HumanTOMATO: Text-aligned Whole-body Motion Generation".☆15May 25, 2024Updated last year
- ☆13Jul 10, 2024Updated last year
- Text-based real image editing with stable diffusion models☆27Dec 19, 2022Updated 3 years ago
- Retrieval-style In-Context Learning for Few-shot Hierarchical Text Classification☆17Jul 13, 2025Updated 9 months ago
- A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but …☆13Dec 1, 2022Updated 3 years ago
- VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection☆26May 31, 2025Updated 11 months ago
- Implementation of a recommender system using RQ-VAE Semantic IDs☆16Apr 15, 2026Updated 2 weeks ago
- Building a multi-agent RAG system with advanced RAG methods☆12Jan 12, 2025Updated last year