Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
☆28 Oct 30, 2024 Updated last year
Alternatives and similar repositories for VisInContext
Users who are interested in VisInContext are comparing it to the libraries listed below.
- ☆74 May 10, 2024 Updated last year
- A Large-scale Dataset for training and evaluating a model's ability on Dense Text Image Generation ☆87 Sep 27, 2025 Updated 5 months ago
- ☆24 Jun 18, 2025 Updated 9 months ago
- ECCV2024_Parrot Captions Teach CLIP to Spot Text ☆66 Sep 6, 2024 Updated last year
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073