NimrodShabtay / LiveXiv
☆10Updated this week
Alternatives and similar repositories for LiveXiv
Users who are interested in LiveXiv are comparing it to the repositories listed below.
- [ICLR 2025] Causal Graphical Models for Vision-Language Compositional Understanding☆9Updated 3 months ago
- [⭐️ WACV 2025 Oral ⭐️] PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition☆13Updated last month
- 2D-TPE: Two-Dimensional Positional Encoding Enhances Table Understanding for Large Language Models (WWW 2025)☆10Updated 3 months ago
- Official Pytorch Implementation of "Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generati…☆9Updated 7 months ago
- KV cache compression via sparse coding☆11Updated 2 months ago
- Renderer for the Crello dataset☆9Updated 5 months ago
- ☆22Updated 2 weeks ago
- Official Repository of Personalized Visual Instruct Tuning☆31Updated 4 months ago
- ☆9Updated 6 months ago
- ☆12Updated 5 months ago
- Code for the paper "ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions" published at CVPR 2025☆16Updated 4 months ago
- The code for the paper "Efficient Self-Supervised Video Hashing with Selective State Spaces" (AAAI'25).☆18Updated 6 months ago
- SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context☆5Updated 6 months ago
- The implementation of our NeurIPS 2024 paper "DarkSAM: Fooling Segment Anything Model to Segment Nothing".☆11Updated 8 months ago
- [ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability☆11Updated 4 months ago
- Official Repo for FoodieQA paper (EMNLP 2024)☆16Updated 3 weeks ago
- LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval☆8Updated 7 months ago
- ☆12Updated 3 months ago
- (ICML 2025) Rethinking Chain-of-Thought from the Perspective of Self-Training☆8Updated 5 months ago
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"☆14Updated 7 months ago
- ☆12Updated 6 months ago
- Adapt MLLMs to Domains via Post-Training☆9Updated 6 months ago
- ☆17Updated 7 months ago
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆18Updated 4 months ago
- [WIP🚧] 2025 up-to-date list of resources on visual tokenizers (primarily for visual generation). Give it a star 🌟 if you find it useful…☆14Updated 6 months ago
- Official repository of "Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning"☆32Updated 4 months ago
- A powerful, enterprise-grade multi-agent system for advanced radiological analysis, diagnosis, and treatment planning. This system levera…☆11Updated 2 weeks ago
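- This project is primarily the assessment project for the 2025 cohort of the Zhejiang University School of Software summer camp (AI camp)☆11Updated 4 months ago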
- ☆28Updated last week
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation☆15Updated 8 months ago