Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool works locally and aims to create inference chains akin to those used by OpenAI-o1, but with localized processing power.
☆28Sep 25, 2024Updated last year
Alternatives and similar repositories for Multimodal-Open-O1
Users that are interested in Multimodal-Open-O1 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- RelayGS: Reconstructing Dynamic Scenes with Large-Scale and Complex Motions via Relay Gaussians☆14Dec 5, 2024Updated last year
- [IJCV 2025] OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation☆15Feb 13, 2026Updated 3 months ago
- [ICLR 2025] Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs☆19Mar 20, 2025Updated last year
- [ACL2026] Uni-MMMU : A Massive Multi-discipline Multimodal Unified Benchmark☆25Apr 13, 2026Updated last month
- A multimodal large-scale model, which performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p…☆14Feb 5, 2024Updated 2 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Official implementation of Geometry Cloak [NeurIPS'24]☆24Apr 16, 2025Updated last year
- A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team.☆74Oct 14, 2024Updated last year
- Code Repository for the NeurIPS 2024 Paper "Toward Efficient Inference for Mixture of Experts".☆19Oct 30, 2024Updated last year
- Official implementation for "Pruning Large Language Models with Semi-Structural Adaptive Sparse Training" (AAAI 2025)☆19Jul 1, 2025Updated 10 months ago
- Official implementation of NeRFProtector [ECCV'24]☆22Aug 27, 2024Updated last year
- [CVPR2025] Code Release for "FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting"☆46Jun 20, 2025Updated 11 months ago
- HerosNet: Hyperspectral Explicable Reconstruction and Optimal Sampling Deep Network for Snapshot Compressive Imaging (CVPR2022)☆20May 16, 2022Updated 4 years ago
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing☆33Dec 9, 2025Updated 5 months ago
- Open deep learning compiler stack for cpu, gpu and specialized accelerators☆19May 22, 2026Updated last week
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- (Nature Communications Engineering 2024) Compressive Confocal Microscopy Imaging at the Single-Photon Level with Ultra-Low Sampling Ratio…☆27Mar 9, 2025Updated last year
- [ACM MM 24 Best Paper Nomination] ResVR: Joint Rescaling and Viewport Rendering of Omnidirectional Images☆24Dec 9, 2024Updated last year
- Dynamic Scene Representation Gaussian Splatting☆39Jun 30, 2025Updated 10 months ago
- ☆14Apr 25, 2025Updated last year
- A simple Python tool to measure the performance of ONNX models.☆27Sep 15, 2024Updated last year
- SLiM: One-shot Quantized Sparse Plus Low-rank Approximation of LLMs (ICML 2025)☆36Nov 28, 2025Updated 6 months ago
- [CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models☆239Nov 7, 2025Updated 6 months ago
- ☆20Apr 24, 2024Updated 2 years ago
- [ICCV 2025] Dynamic-VLM☆28Dec 16, 2024Updated last year
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Focal loss implemention by PyTorch☆11Dec 16, 2018Updated 7 years ago
- ⛏️This is the storage of my Slides、Reports and Papers. | 存储PPT、报告和论文☆12Oct 27, 2024Updated last year
- Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs (ECCV 2024)☆19Jul 15, 2024Updated last year
- 5th place solution for ACM MM2021 Robust Logo Detection Grand Challenge☆13Dec 25, 2022Updated 3 years ago
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua…☆64May 15, 2025Updated last year
- ☆12Apr 1, 2025Updated last year
- [ICML 2026] The official implementation of paper "Generation Enhances Understanding in Unified Multimodal Models via Multi-Representation…☆72Updated this week
- [TCSVT 2024] Progressive Content-aware Coded Hyperspectral Compressive Imaging☆49Nov 12, 2024Updated last year
- The offical implementation of 'FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant'☆49Nov 22, 2024Updated last year
- End-to-end encrypted cloud storage - Proton Drive • AdSpecial offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
- ☆19Nov 18, 2024Updated last year
- [TCSVT 2024] D3C2-Net: Dual-Domain Deep Convolutional Coding Network for Compressive Sensing☆45Dec 9, 2024Updated last year
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt…☆30Oct 21, 2024Updated last year
- Code and Dataset for the CVPRW Paper "Where did I leave my keys? — Episodic-Memory-Based Question Answering on Egocentric Videos"☆29Aug 28, 2023Updated 2 years ago
- ☆41Nov 22, 2025Updated 6 months ago
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework"☆19Jan 18, 2026Updated 4 months ago
- ☆26Jun 5, 2025Updated 11 months ago