silicx / GoldFromOres-BiLP
Preview code of ECCV'24 paper "Distill Gold from Massive Ores" (BiLP)
☆24Updated 10 months ago
Alternatives and similar repositories for GoldFromOres-BiLP
Users who are interested in GoldFromOres-BiLP are comparing it to the repositories listed below
- Official implementation of Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement.☆29Updated 8 months ago
- Official implementation of ECCV 2024 paper: Take A Step Back: Rethinking the Two Stages in Visual Reasoning☆11Updated 7 months ago
- This is the official repository of OCL (ICCV 2023).☆20Updated last year
- Code for our ICML'24 paper on multimodal dataset distillation☆37Updated 7 months ago
- [CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…☆36Updated 3 weeks ago
- Official PyTorch Implementation of Learning Affordance Grounding from Exocentric Images, CVPR 2022☆59Updated 6 months ago
- An unofficial pytorch dataloader for Open X-Embodiment Datasets https://github.com/google-deepmind/open_x_embodiment☆14Updated 4 months ago
- ☆46Updated 4 months ago
- Latent Motion Token as the Bridging Language for Robot Manipulation☆85Updated this week
- Affordance Grounding from Demonstration Video to Target Image (CVPR 2023)☆44Updated 9 months ago
- (ECCV 2024) Official repository of paper "EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding"☆28Updated last month
- Accepted by CVPR 2024☆33Updated 11 months ago
- ☆16Updated 10 months ago
- Data pre-processing and training code on Open-X-Embodiment with pytorch☆11Updated 3 months ago
- [ICML 2024] A Touch, Vision, and Language Dataset for Multimodal Alignment☆76Updated 3 months ago
- [ICLR 2024] Seer: Language Instructed Video Prediction with Latent Diffusion Models☆31Updated 11 months ago
- [CVPR 2024] Binding Touch to Everything: Learning Unified Multimodal Tactile Representations☆51Updated 3 months ago
- ☆69Updated 5 months ago
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆30Updated 7 months ago
- ☆78Updated this week
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models☆29Updated 6 months ago
- [CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key☆51Updated last month
- Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223☆129Updated 2 months ago
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks☆58Updated 7 months ago
- [ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction☆73Updated 3 weeks ago
- Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning"☆26Updated last month
- Official Implementation of CAPEAM (ICCV'23)☆13Updated 5 months ago
- [World-Model-Survey-2024] Paper list and projects for World Model☆9Updated 6 months ago
- [AAAI2023] Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task (Oral)☆39Updated last year
- AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation☆71Updated last month