LEO: A powerful Hybrid Multimodal LLM
☆20Jan 18, 2025Updated last year
Alternatives and similar repositories for LEO
Users that are interested in LEO are comparing it to the libraries listed below.
- ☆10Apr 7, 2025Updated last year
- ☆13Mar 28, 2025Updated last year
- Visual Spatial Tuning☆198Mar 25, 2026Updated last month
- ☆22Jul 11, 2025Updated 9 months ago
- More reliable Video Understanding Evaluation☆15Sep 23, 2025Updated 7 months ago
- Evaluation results for Machine Translation within the BigScience project☆11May 15, 2023Updated 2 years ago
- Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models☆23Apr 16, 2025Updated last year
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆21Jul 20, 2024Updated last year
- ☆25Jan 15, 2025Updated last year
- LLMBind: A Unified Modality-Task Integration Framework☆19Jun 16, 2024Updated last year
- [NeurIPS25] Official Implementation (PyTorch) of "DeepVideo-R1"☆33Feb 22, 2026Updated 2 months ago
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆65Jul 22, 2025Updated 9 months ago
- Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation☆35Jun 30, 2025Updated 10 months ago
- List of papers on Hallucination in LMM☆10Nov 29, 2023Updated 2 years ago
- [CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models☆18Jul 22, 2024Updated last year
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding☆84Jul 4, 2025Updated 10 months ago
- [ACL 2025] RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection☆34Jul 23, 2025Updated 9 months ago
- PyTorch implementation of SubCenterArcface and sphereface2, with a proof of the easy_margin part of Arcface added in the code.☆12Dec 1, 2021Updated 4 years ago
- ☆15Jan 24, 2018Updated 8 years ago
- Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning"☆23Oct 26, 2021Updated 4 years ago
- Auto1111 port of NVlab's adversarial purification method that uses the forward and reverse processes of diffusion models to remove advers…☆13Aug 8, 2023Updated 2 years ago
- ☆15Feb 24, 2023Updated 3 years ago
- Provides a selection of 12 logic gates that you can interconnect with patch cables to make a variety of different logic circuits.☆11Feb 28, 2026Updated 2 months ago
- A paper list of multilingual pre-trained models (continually updated).☆24Jun 18, 2024Updated last year
- Official implementation of "Re-Aligning Language to Visual Objects with an Agentic Workflow"☆32Apr 20, 2025Updated last year
- Official Implementation of CVPR 2022 paper: "Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning…☆35Feb 10, 2023Updated 3 years ago
- [CVPR 2026] Scaling Spatial Intelligence with Multimodal Foundation Models☆231Apr 29, 2026Updated last week
- Open-vocabulary Semantic Segmentation☆33Feb 16, 2024Updated 2 years ago
- ☆33Sep 27, 2024Updated last year
- Code for DeCo: Decoupling token compression from semantic abstraction in multimodal large language models☆78Jul 14, 2025Updated 9 months ago
- [ICML 2025] Official Github Repo for WOMD-Reasoning Dataset☆45Nov 27, 2025Updated 5 months ago
- Student assignment upload, preview, and grading system☆11Jul 18, 2016Updated 9 years ago
- (AAAI 24) Step Vulnerability Guided Mean Fluctuation Adversarial Attack against Conditional Diffusion Models☆11Oct 12, 2024Updated last year
- ☆21Nov 14, 2025Updated 5 months ago
- The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding" (CVPR 2025 Highlight)☆50Apr 27, 2025Updated last year
- Applies ROME and MEMIT on Mamba-S4 models☆15Apr 5, 2024Updated 2 years ago
- ☆38Jun 20, 2025Updated 10 months ago
- Initial code for computer vision experiments☆11Jan 1, 2023Updated 3 years ago
- ☆108Dec 27, 2024Updated last year