Towards Pixel-Level VLM Perception via Simple Points Prediction
☆103 · Feb 9, 2026 · Updated 2 months ago
Alternatives and similar repositories for SimpleSeg
Users that are interested in SimpleSeg are comparing it to the libraries listed below.
- [ArXiv 26] The official repository of "ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors". ☆32 · Mar 5, 2026 · Updated last month
- M³: Dense Matching Meets Multi-View Foundation Models for Monocular Gaussian Splatting SLAM ☆61 · Mar 18, 2026 · Updated last month
- ☆86 · Oct 10, 2025 · Updated 6 months ago
- MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head (ICLR 2026) ☆144 · Apr 17, 2026 · Updated last week
- In-Context Reinforcement Learning for Tool Use in Large Language Models ☆46 · Mar 26, 2026 · Updated last month
- [CVPR 2026 Highlight] SegEarth-R2: Towards Comprehensive Language-guided Segmentation for Remote Sensing Images ☆53 · Jan 24, 2026 · Updated 3 months ago
- [ACL'26] EvoToken-DLM (Beyond Hard Masks: Progressive Token Evolution for Diffusion Language) ☆47 · Apr 7, 2026 · Updated 3 weeks ago
- ☆13 · Feb 2, 2025 · Updated last year
- ☆12 · Oct 7, 2024 · Updated last year
- Efficient Feature Extraction for High-resolution Video Frame Interpolation (BMVC 2022) ☆14 · Aug 24, 2023 · Updated 2 years ago
- ☆20 · Nov 16, 2025 · Updated 5 months ago
- ☆14 · Jan 22, 2025 · Updated last year
- Official implementation of Splat Feature Solver: https://arxiv.org/abs/2508.12216 ☆38 · Feb 4, 2026 · Updated 2 months ago
- [NeurIPS 2025] Frame In-N-Out: Unbounded Controllable Image-to-Video Generation ☆31 · Jan 5, 2026 · Updated 3 months ago
- ☆36 · Jan 30, 2026 · Updated 2 months ago
- Official implementation of "VideoMaMa: Mask-Guided Video Matting via Generative Prior", CVPR 2026 ☆435 · Apr 1, 2026 · Updated 3 weeks ago
- Official Repository for "Finding NeMO: A Geometry-Aware Representation of Template Views for Few-Shot Perception" ☆22 · Feb 17, 2026 · Updated 2 months ago
- Geo-OLMs Repo: Accepted to ACM COMPASS 2025 ☆23 · Jun 17, 2025 · Updated 10 months ago
- GHUStereo models are novel real-time stereo matching architectures with a low computation complexity characterized by compact cost volum… ☆31 · Dec 14, 2025 · Updated 4 months ago
- ☆27 · Jun 3, 2025 · Updated 10 months ago
- A unified robotic manipulation learning framework ☆22 · Sep 4, 2025 · Updated 7 months ago
- OneGov GEVER core package ☆12 · Apr 20, 2026 · Updated last week
- The evaluation code for A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5 ☆53 · Jan 18, 2026 · Updated 3 months ago
- Residual Context Diffusion (RCD): Repurposing discarded signals as structured priors for high-performance reasoning in dLLMs. ☆56 · Mar 12, 2026 · Updated last month
- TBD ☆56 · Mar 13, 2026 · Updated last month
- [ACL 2026 Findings] CoV: Chain-of-View Prompting for Spatial Reasoning ☆58 · Apr 7, 2026 · Updated 3 weeks ago
- ☆43 · Sep 1, 2025 · Updated 7 months ago
- Green-VLA: Staged Vision-Language-Action Model for Generalist Robots ☆122 · Mar 5, 2026 · Updated last month
- Video Depth Propagation [3DV 2026] ☆35 · Jan 23, 2026 · Updated 3 months ago
- Beyond KV Caching: Shared Attention for Efficient LLMs ☆20 · Jul 19, 2024 · Updated last year
- [arXiv 2026] Official PyTorch Repository for "Coarse-Guided Visual Generation via Weighted h-Transform Sampling" ☆41 · Mar 16, 2026 · Updated last month
- MonetGPT: Solving Puzzles Enhances MLLMs' Image Retouching Skills [SIGGRAPH 2025] ☆83 · Jan 21, 2026 · Updated 3 months ago
- GEOSatDB is a semantic representation of Earth observation satellites and sensors that can be used to easily discover available Earth obs… ☆15 · Aug 6, 2024 · Updated last year
- Weather4Cast 2023 NeurIPS Competition - RainAI ☆16 · Dec 4, 2023 · Updated 2 years ago
- Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations @ ICCV21 ☆13 · Jul 15, 2022 · Updated 3 years ago
- ☆228 · Jul 17, 2025 · Updated 9 months ago
- Official repo for StyleMe3D ☆30 · Apr 22, 2025 · Updated last year
- UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios ☆129 · Apr 9, 2026 · Updated 2 weeks ago
- Distribute and run transformer encoders with a single file. ☆93 · Updated this week