NimrodShabtay / LiveXiv
☆12Updated 2 months ago
Alternatives and similar repositories for LiveXiv
Users who are interested in LiveXiv are comparing it to the libraries listed below
- 2D-TPE: Two-Dimensional Positional Encoding Enhances Table Understanding for Large Language Models (WWW 2025)☆10Updated 5 months ago
- ☆10Updated 10 months ago
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25)☆10Updated 3 months ago
- [ICCV 2025 Oral] CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation☆23Updated last month
- [⭐️ WACV 2025 Oral ⭐️] PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition☆19Updated 3 months ago
- KV cache compression via sparse coding☆14Updated 4 months ago
- ☆14Updated 5 months ago
- Context-Informed Machine Translation of Manga using Multimodal Large Language Models☆11Updated 9 months ago
- Code for the paper "ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions" published at CVPR 2025☆20Updated 6 months ago
- Instagram Automation Tool is a framework that automates various Instagram tasks, including file-based operations and web automation (via …☆16Updated 4 months ago
- ☆21Updated 2 months ago
- ☆25Updated 2 months ago
- This project is mainly the assessment project for the 2025 Zhejiang University School of Software summer camp (AI camp)☆11Updated 6 months ago
- unofficial☆11Updated 11 months ago
- [WACV2025] source code of StrDA: https://arxiv.org/abs/2410.09913☆11Updated 5 months ago
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?☆74Updated 2 months ago
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning☆48Updated 2 months ago
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆60Updated 2 months ago
- Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model.☆85Updated last month
- The official code of "PixelWorld: Towards Perceiving Everything as Pixels" [TMLR25]☆15Updated 2 weeks ago
- [ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models☆35Updated 3 months ago
- ☆27Updated 2 months ago
- DELT: Data Efficacy for Language Model Training☆34Updated 3 weeks ago
- Official implementation for P2SAM (ACM MM 2024)☆13Updated 9 months ago
- ☆15Updated 9 months ago
- ☆32Updated 4 months ago
- ☆12Updated last month
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"☆15Updated 9 months ago
- On Path to Multimodal Generalist: General-Level and General-Bench☆19Updated 2 months ago
- ☆75Updated 3 months ago