Official Repository of OmniCaptioner
☆168Apr 23, 2025Updated last year
Alternatives and similar repositories for OmniCaptioner
Users that are interested in OmniCaptioner are comparing it to the libraries listed below.
- (ICCV-2025 Official Code) Improving Generalist Model with Domain-Specific Experts☆87Oct 29, 2025Updated 6 months ago
- Official Repository: A Comprehensive Benchmark for Logical Reasoning in MLLMs☆45Jun 17, 2025Updated 11 months ago
- [T-PAMI 2024] & [CVPR 2023] Vote2Cap-DETR; A set-to-set perspective towards 3D Dense Captioning; State-of-the-Art 3D Dense Captioning met…☆104Aug 17, 2024Updated last year
- [Neural Networks 2025] The official code for the paper "MNet: A Multi-Scale Network for Visible Watermark Removal."☆17Jun 16, 2025Updated 11 months ago
- [NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy☆73Jan 22, 2025Updated last year
- This is the open-source code for TokenCarve.☆26Jan 23, 2026Updated 4 months ago
- Official repository for "TrustGeoGen: Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving"☆23Sep 1, 2025Updated 8 months ago
- ☆26Aug 9, 2025Updated 9 months ago
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25)☆15Jun 26, 2025Updated 11 months ago
- Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision☆11Jul 22, 2024Updated last year
- [ICLR 2026] The official implementation of "RegionE: Adaptive Region-Aware Generation for Efficient Image Editing"☆105Feb 3, 2026Updated 3 months ago
- ☆27Mar 3, 2025Updated last year
- ☆11Nov 12, 2018Updated 7 years ago
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…☆103Nov 9, 2024Updated last year
- Multimodal Document Intelligence Platform☆41Apr 10, 2026Updated last month
- [ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners.☆22Sep 24, 2025Updated 8 months ago
- (ACL-2025 main conference) SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automat…☆333Aug 27, 2025Updated 9 months ago
- Official codes for "Q-Ground: Image Quality Grounding with Large Multi-modality Models", ACM MM2024 (Oral)☆44Apr 21, 2026Updated last month
- Powerful Python-based tool for scraping Tweets, user data, and trends from Twitter without needing API access or authentication, offering…☆130Jan 4, 2025Updated last year
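- Digital Base (数字底座) is a unified, secure management support platform for the digital transformation of large government bodies and enterprises, built on identity authentication, organizational structure, positions and job titles, application systems, resource roles, data catalogs, security controls, and related functions. It follows the three-administrator management model, provides microservices, multi-tenancy, containerization, and support for domestic (Chinese) technology stacks, and lets users quickly build their own business applications with a code generator, while also linking to various…☆2,591May 12, 2026Updated 2 weeks ago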
- Comprehensive AI-powered urban development optimization platform that combines deep learning and reinforcement learning for data-driven b…☆35Nov 26, 2025Updated 6 months ago
- [ICCV2025] VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation☆33Aug 18, 2025Updated 9 months ago
- [NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models☆1,198Oct 16, 2025Updated 7 months ago
- Two languages, one purpose: turning words into geometry.☆160Dec 31, 2025Updated 4 months ago
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework"☆19Jan 18, 2026Updated 4 months ago
- A SAR domain-specific language defined in CXX & Python. Keywords: AST, MLIR, LLVM, FPGA HLS. Currently under development...☆17Mar 28, 2026Updated last month
- ☆73May 17, 2025Updated last year
- [arXiv 2024] Is Oracle Pruning the True Oracle?☆26Jan 10, 2025Updated last year
- Your codebase was probably AI-generated. Get a better handle on it. Noodles creates interactive diagrams that visualize how your code act…☆1,275Mar 13, 2026Updated 2 months ago
- ☆24Jun 18, 2025Updated 11 months ago
- The accepted paper for cvpr2025.☆56Dec 9, 2025Updated 5 months ago
- 🔥minerproxy: mining-pool fee skimming, mining-pool relay, built for mining-farm operations☆3,560May 18, 2026Updated last week
- [ AAAI26 ]: “VTinker: Guided Flow Upsampling and Texture Mapping for High-Resolution Video Frame Interpolation”☆19Mar 26, 2026Updated 2 months ago
- [NeurIPS 2025 D&B Track] MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research☆29May 8, 2026Updated 2 weeks ago
- 🏭 Mega Scale Multimodal DataPipeline for SOTA Foundation Models☆364May 12, 2026Updated 2 weeks ago
- Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach, CVPR 2024☆26Jul 25, 2024Updated last year
- The next-generation deep reinforcement learning toolkit☆3,464Jun 16, 2023Updated 2 years ago
- ☆25May 13, 2024Updated 2 years ago
- [CVPR'24] Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression☆15Jul 1, 2024Updated last year