NimrodShabtay / LiveXiv
☆13Updated 6 months ago
Alternatives and similar repositories for LiveXiv
Users who are interested in LiveXiv are comparing it to the repositories listed below
- 2D-TPE: Two-Dimensional Positional Encoding Enhances Table Understanding for Large Language Models (WWW 2025)☆10Updated 9 months ago
- Official Pytorch Implementation of "Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generati…☆11Updated 5 months ago
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25)☆13Updated 7 months ago
- ☆25Updated 7 months ago
- ☆16Updated 9 months ago
- ☆27Updated 6 months ago
- ☆10Updated last year
- KV cache compression via sparse coding☆17Updated 3 months ago
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning☆52Updated 6 months ago
- ☆14Updated last year
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"☆19Updated last year
- A Comprehensive Dataset for Advanced Image Generation and Editing☆31Updated 3 months ago
- Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks☆34Updated 2 months ago
- ☆63Updated 6 months ago
- Code for the paper "ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions" published at CVPR 2025☆20Updated 10 months ago
- unofficial☆12Updated last year
- This project is mainly the assessment project for the 2025 Zhejiang University School of Software summer camp (AI camp)☆12Updated 10 months ago
- ☆41Updated 8 months ago
- ☆21Updated 4 months ago
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration☆26Updated last year
- ☆35Updated 2 months ago
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM"☆17Updated 3 months ago
- The official repo for the DanQing dataset.☆26Updated 2 weeks ago
- Instagram Automation Tool is a framework that automates various Instagram tasks, including file-based operations and web automation (via …☆15Updated 8 months ago
- PyTorch code for "ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning"☆21Updated last year
- ∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation☆19Updated 11 months ago
- [ICCV 2025] Dynamic-VLM☆28Updated last year
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆64Updated 6 months ago
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning☆33Updated 5 months ago
- ☆80Updated 7 months ago