Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
☆28Oct 30, 2024Updated last year
Alternatives and similar repositories for VisInContext
Users who are interested in VisInContext are comparing it to the libraries listed below.
- ☆75May 10, 2024Updated 2 years ago
- FQGAN: Factorized Visual Tokenization and Generation☆59Mar 29, 2025Updated last year
- ECCV2024_Parrot Captions Teach CLIP to Spot Text☆66Sep 6, 2024Updated last year
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073☆31Jul 9, 2024Updated last year
- ☆23May 5, 2026Updated 2 weeks ago
- ☆67May 2, 2026Updated 2 weeks ago
- ☆14Dec 9, 2023Updated 2 years ago
- ☆14Sep 28, 2020Updated 5 years ago
- ☆60Apr 28, 2025Updated last year
- TPDiff: Temporal Pyramid Video Diffusion Model☆25Mar 13, 2025Updated last year
- [CVPR 2025] PyTorch implementation of paper "FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training"☆33Jul 8, 2025Updated 10 months ago
- [NeurIPS 2025 Spotlight] Official PyTorch implementation of Vgent☆45Nov 30, 2025Updated 5 months ago
- ☆11May 24, 2024Updated last year
- [ Arxiv 2023 ] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection"☆16Aug 30, 2023Updated 2 years ago
- FreeVA: Offline MLLM as Training-Free Video Assistant☆69Jun 9, 2024Updated last year
- FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models, ICCV 2023☆13Jul 13, 2024Updated last year
- Code release for "Category-Specific Prompts for Animal Action Recognition with Pretrained Vision-Language Models"☆14Feb 21, 2024Updated 2 years ago
- [ICCV 2021] Multimodal Knowledge Expansion☆10Aug 28, 2021Updated 4 years ago
- A curated list of all awesome pygames created by Agneay B Nair☆10Apr 28, 2024Updated 2 years ago
- Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework.☆122Jul 27, 2025Updated 9 months ago
- (IJCV 2023) Official implementation of "SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels"☆13Mar 20, 2025Updated last year
- Adaptation datasets and scripts for the paper "Reducing gender bias in Neural Machine Translation as a domain adaptation problem" (ACL 20…☆13Mar 18, 2021Updated 5 years ago
- Reference implementation of the paper "Efficient and Scalable Graph Generation through Iterative Local Expansion"☆17Aug 27, 2025Updated 8 months ago
- [AAAI 2026] SlideTailor: Personalized Presentation Slide Generation for Scientific Papers☆55Apr 18, 2026Updated last month
- Source code and data for ADEPT: A DEbiasing PrompT Framework (AAAI-23).☆15Dec 13, 2024Updated last year
- ☆38Jan 9, 2026Updated 4 months ago
- [EMNLP'24 (Main)] DRPO(Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-…☆25Nov 17, 2024Updated last year
- ☆17Aug 1, 2024Updated last year
- Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers☆20Feb 22, 2021Updated 5 years ago
- Codebase for Linguistic Collapse: Neural Collapse in (Large) Language Models [NeurIPS 2024] [arXiv:2405.17767]☆18Apr 14, 2025Updated last year
- [ACM MM 2022] MM_Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing☆16Aug 26, 2022Updated 3 years ago
- ☆29Apr 28, 2026Updated 3 weeks ago
- Subspace Graph Physics☆15Jun 14, 2024Updated last year
- Grounding Language Models for Compositional and Spatial Reasoning☆18Oct 26, 2022Updated 3 years ago
- NegCLIP.☆41Feb 6, 2023Updated 3 years ago
- ☆15Apr 25, 2023Updated 3 years ago
- ☆58Apr 24, 2024Updated 2 years ago
- The first open-domain closed-loop revisited benchmark for evaluating memory consistency and action control in world models.☆58Feb 10, 2026Updated 3 months ago
- An MCP Server that works with Roo Code/Cline.Bot/Claude Desktop to optimize costs by intelligently routing coding tasks between local LLM…☆41Updated this week