Pixio: a capable vision encoder dedicated to dense prediction, simply by pixel reconstruction
☆351Jan 22, 2026Updated last month
Alternatives and similar repositories for pixio
Users that are interested in pixio are comparing it to the libraries listed below
- [ICCV 2025] Boosting Multi-View Indoor 3D Object Detection via Adaptive 3D Volume Construction☆23Oct 1, 2025Updated 5 months ago
- ☆22Dec 11, 2024Updated last year
- [WACV 2025] Official code of "SEED4D: A Synthetic Ego-Exo Dynamic 4D Data Generator, Driving Dataset and Benchmark"☆21Sep 3, 2025Updated 6 months ago
- [ICCV 2025] UniVerse: Unleashing the Scene Prior of Video Diffusion Models for Robust Radiance Field Reconstruction☆26Oct 3, 2025Updated 5 months ago
- Cameras as Relative Positional Encoding☆676Dec 18, 2025Updated 2 months ago
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation☆46Aug 26, 2025Updated 6 months ago
- [CVPR 2026] "E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training" official implementation.☆264Feb 24, 2026Updated last week
- Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders☆208Feb 13, 2026Updated 3 weeks ago
- PyTorch implementation of NEPA☆324Feb 9, 2026Updated 3 weeks ago
- [ICCV 2025] SpatialTrackerV2: 3D Point Tracking Made Easy☆918Updated this week
- [Arxiv'25] DINO-Tok: Adapting DINO for Visual Tokenizers☆35Nov 25, 2025Updated 3 months ago
- UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation☆135Jun 10, 2025Updated 8 months ago
- [ICLR 2026] pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation☆271Feb 23, 2026Updated last week
- Public code for XFactor: Introduces the first geometry-free model to achieve true self-supervised / pose-free Novel View Synthesis (NVS) …☆127Oct 22, 2025Updated 4 months ago
- Original reference implementation of "StopThePop: Sorted Gaussian Splatting for View-Consistent Real-time Rendering"☆239May 24, 2024Updated last year
- The official code of Yume☆621Jan 14, 2026Updated last month
- CoWTracker: Tracking by Warping instead of Correlation☆107Feb 5, 2026Updated last month
- Official Repo for Fast-SAM3D: 3Dfy Anything in Images but Faster☆97Feb 19, 2026Updated 2 weeks ago
- Code of WinT3R: Window-Based Streaming Reconstruction With Camera Token Pool☆221Feb 13, 2026Updated 2 weeks ago
- Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence☆279Updated this week
- [ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"☆441Oct 29, 2025Updated 4 months ago
- Official Repo for Self-Forcing++ High Quality Long Video Generation☆241Oct 13, 2025Updated 4 months ago
- [SIGGRAPH Asia 2025] WorldExplorer: Towards Generating Fully Navigable 3D Scenes☆181Dec 8, 2025Updated 2 months ago
- Official implementation of our paper "Flow-Anything: Learning Real-World Optical Flow Estimation from Large-Scale Single-view Images"☆73Jun 10, 2025Updated 8 months ago
- RePlan: Reasoning-Guided Region Planning for Complex Instruction-Based Image Editing☆58Dec 26, 2025Updated 2 months ago
- [ECCV 2024] Event-Based Motion Magnification☆67Jul 4, 2024Updated last year
- [CVPR 2026] FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection☆25Feb 10, 2026Updated 3 weeks ago
- [ICCV 2025 Oral] Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction (BA-Track)☆95Nov 25, 2025Updated 3 months ago
- [WACV 2026] PyTorch code for 4D-Animal.☆27Nov 18, 2025Updated 3 months ago
- [ICML 2025] We introduce Lie group Relative position Encodings (LieRE), which go beyond RoPE in supporting n-dimensional inputs.☆14Aug 8, 2025Updated 6 months ago
- The official implementation of Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion [AAAI'2…☆15Feb 2, 2026Updated last month
- Data and code for EACL'24 paper: Over-Reasoning and Redundant Calculation of Large Language Models☆11Jan 23, 2024Updated 2 years ago
- The official code of "Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling"☆47Feb 26, 2026Updated last week
- ☆208Dec 9, 2024Updated last year
- State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!☆2,177Feb 11, 2026Updated 3 weeks ago
- Official codes for the paper "GARDO: Reinforcing Diffusion Models without Reward Hacking"☆56Feb 2, 2026Updated last month
- [CVPR 2026] The official implementation of the paper "Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation"☆105Updated this week
- [ICLR 2026] π^3: Permutation-Equivariant Visual Geometry Learning☆1,673Feb 27, 2026Updated last week
- Official repo for paper "EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture."☆62Dec 16, 2025Updated 2 months ago