Official implementation of Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
☆25 · Feb 25, 2025 · Updated last year
Alternatives and similar repositories for Patch_Scaling
Users who are interested in Patch_Scaling are comparing it to the libraries listed below.
- ☆28 · Feb 27, 2025 · Updated last year
- ☆82 · Feb 27, 2025 · Updated last year
- Official PyTorch implementation of Agglomerative Token Clustering presented at ECCV 2024 · ☆20 · Sep 19, 2024 · Updated last year
- ☆20 · Jan 23, 2024 · Updated 2 years ago
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality · ☆21 · Oct 8, 2024 · Updated last year
- Extending context length of visual language models · ☆12 · Dec 18, 2024 · Updated last year
- ☆14 · Apr 15, 2024 · Updated last year
- [EMNLP'2023 Findings] MoqaGPT, for zero-shot multimodal question answering with LLMs · ☆13 · Dec 28, 2024 · Updated last year
- [ICCV 2025] Official Implementation of Steering Rectified Flow Models in the Vector Field for Controlled Image Generation · ☆45 · Jun 27, 2025 · Updated 8 months ago
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs. · ☆19 · Jun 27, 2024 · Updated last year
- The source code of [WWW 2025] MoDiCF · ☆12 · Jul 12, 2025 · Updated 8 months ago
- [ICLR2025] This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision · ☆90 · May 30, 2025 · Updated 9 months ago
- [EMNLP'2024 Findings] Explore generated documents for enhanced IR with LLMs. We enhance BM25 to surpass strong dense retriever on many da… · ☆15 · Mar 28, 2025 · Updated 11 months ago
- Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark · ☆28 · Apr 22, 2025 · Updated 10 months ago
- This is the official implementation of the paper “Griffin: Towards a Graph-Centric Relational Database Foundation Model.” · ☆35 · Sep 25, 2025 · Updated 5 months ago
- Source code related to the research paper entitled RVENet: A Large Echocardiographic Dataset for the Deep Learning-Based Assessment of Ri… · ☆12 · Mar 10, 2024 · Updated 2 years ago
- ☆133 · Jun 26, 2024 · Updated last year
- [NeurIPS24] VisMin: Visual Minimal-Change Understanding · ☆19 · Mar 3, 2025 · Updated last year
- [ICLR 2025] Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate · ☆19 · Apr 22, 2025 · Updated 11 months ago
- ☆25 · Apr 10, 2025 · Updated 11 months ago
- ☆10 · Apr 8, 2018 · Updated 7 years ago
- Collections of RLxLM experiments using minimal codes · ☆14 · Feb 17, 2025 · Updated last year
- ☆12 · Nov 16, 2020 · Updated 5 years ago
- ☆24 · Feb 14, 2025 · Updated last year
- cliptrase · ☆47 · Sep 1, 2024 · Updated last year
- ☆14 · Jul 2, 2024 · Updated last year
- Code for "A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models" · ☆17 · Jul 20, 2025 · Updated 8 months ago
- ✨ Official code for our paper: "Uncertainty-o: One Model-agnostic Framework for Unveiling Epistemic Uncertainty in Large Multimodal Model… · ☆20 · Mar 13, 2025 · Updated last year
- ☆27 · Dec 1, 2025 · Updated 3 months ago
- The Matlab Code for the AISTATS 2015 paper "Learning Deep Sigmoid Belief Network with Data Augmentation" · ☆13 · Sep 20, 2015 · Updated 10 years ago
- visual question answering prompting recipes for large vision-language models · ☆28 · Sep 14, 2024 · Updated last year
- ☆15 · Jul 9, 2025 · Updated 8 months ago
- Code for MGDM algorithm, ICML 2025, https://arxiv.org/abs/2502.03332 · ☆15 · May 19, 2025 · Updated 10 months ago
- Published in Nature Communications · ☆12 · Feb 19, 2024 · Updated 2 years ago
- [WSDM 2025] Source code for "Spectrum-based Modality Representation Fusion Graph Convolutional Network for Multimodal Recommendation". · ☆37 · Dec 22, 2024 · Updated last year
- Code for LDLForests · ☆20 · Oct 4, 2018 · Updated 7 years ago
- Official codebase for “In-Context Learning with Many Demonstration Examples” · ☆16 · Feb 13, 2023 · Updated 3 years ago
- ☆14 · Feb 21, 2022 · Updated 4 years ago
- Official Implementation of paper "Distilling Long-tailed Datasets" [CVPR 2025] · ☆21 · Aug 13, 2025 · Updated 7 months ago