LEO: A powerful Hybrid Multimodal LLM
☆20Jan 18, 2025Updated last year
Alternatives and similar repositories for LEO
Users that are interested in LEO are comparing it to the libraries listed below.
- ☆13Mar 28, 2025Updated last year
- Visual Spatial Tuning☆198Mar 25, 2026Updated 2 months ago
- ☆23Jul 11, 2025Updated 10 months ago
- Implementation of Pix2Seq in PyTorch☆10Feb 3, 2022Updated 4 years ago
- The code for paper entitled "Data-Driven Modulation Optimization with LMMSE Equalization for Reliability Enhancement in Underwater Acoust…☆19Apr 9, 2026Updated last month
- Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models☆25Apr 16, 2025Updated last year
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆22Jul 20, 2024Updated last year
- [CVPR 2025] Official code of "DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting"☆55Sep 5, 2025Updated 8 months ago
- LLMBind: A Unified Modality-Task Integration Framework☆19Jun 16, 2024Updated last year
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆65Jul 22, 2025Updated 10 months ago
- Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation☆35Jun 30, 2025Updated 10 months ago
- List of papers on Hallucination in LMM☆10Nov 29, 2023Updated 2 years ago
- [CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models☆18Jul 22, 2024Updated last year
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding☆84Jul 4, 2025Updated 10 months ago
- [ACL 2025] RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection☆34Jul 23, 2025Updated 10 months ago
- A PyTorch implementation of SubCenterArcface and sphereface2, with a proof of the easy_margin part of Arcface added in the code.☆12Dec 1, 2021Updated 4 years ago
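- Online learning website with teacher and student portals (courseware upload, download, and deletion; teaching teams; class management; student management; attendance; assignment submission, marking, and scoring; discussion forum; password recovery)☆11Feb 16, 2022Updated 4 years ago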
- ☆16Jan 24, 2018Updated 8 years ago
- Auto1111 port of NVlab's adversarial purification method that uses the forward and reverse processes of diffusion models to remove advers…☆13Aug 8, 2023Updated 2 years ago
- Official implementation of "Re-Aligning Language to Visual Objects with an Agentic Workflow"☆32Apr 20, 2025Updated last year
- Official Implementation of CVPR 2022 paper: "Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning…☆35Feb 10, 2023Updated 3 years ago
- [NeurIPS'25] Backdoor Cleaning without External Guidance in MLLM Fine-tuning☆20Oct 13, 2025Updated 7 months ago
- Open-vocabulary Semantic Segmentation☆33Feb 16, 2024Updated 2 years ago
- ☆33Sep 27, 2024Updated last year
- Code for DeCo: Decoupling token compression from semantic abstraction in multimodal large language models☆78Jul 14, 2025Updated 10 months ago
- [ICML 2025] Official Github Repo for WOMD-Reasoning Dataset☆45Nov 27, 2025Updated 5 months ago
- 🐧 Unify-Agent: An end-to-end unified multimodal agent for faithful, knowledge-grounded image generation.☆79May 2, 2026Updated 3 weeks ago
- Student assignment upload, preview, and grading system☆11Jul 18, 2016Updated 9 years ago
- Recommended resources and tools for research beginners☆17Jul 6, 2020Updated 5 years ago
- ☆22Nov 14, 2025Updated 6 months ago
- The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding. (CVPR 2025 Highlight)"☆50Apr 27, 2025Updated last year
- Applies ROME and MEMIT on Mamba-S4 models☆15Apr 5, 2024Updated 2 years ago
- ☆38Jun 20, 2025Updated 11 months ago
- Initial code for computer vision experiments☆11Jan 1, 2023Updated 3 years ago
- ☆109Dec 27, 2024Updated last year
- [AAAI 2025] Official Implementation of 3D$^2$-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling☆15Mar 30, 2025Updated last year
- Hardware and firmware for a USB connected relay box☆16Mar 26, 2024Updated 2 years ago
- Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation☆26Dec 12, 2025Updated 5 months ago
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models☆112Oct 10, 2024Updated last year