bigai-nlco / LatentSeek
Official Repository of LatentSeek
☆76Jun 6, 2025Updated 8 months ago
Alternatives and similar repositories for LatentSeek
Users who are interested in LatentSeek are comparing it to the libraries listed below.
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of…☆80May 30, 2025Updated 8 months ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training☆54Dec 13, 2025Updated 2 months ago
- Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space"☆59Dec 17, 2025Updated 2 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆17Apr 2, 2025Updated 10 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆21Dec 22, 2025Updated last month
- [NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning"☆30Jan 12, 2026Updated last month
- ☆18Jun 10, 2025Updated 8 months ago
- Official repository for "CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation"☆68Dec 15, 2025Updated 2 months ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity☆22Aug 28, 2025Updated 5 months ago
- A fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model.☆37Apr 7, 2025Updated 10 months ago
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs☆23Sep 21, 2025Updated 4 months ago
- ☆31Dec 3, 2025Updated 2 months ago
- [EMNLP'22] Weakly-Supervised Temporal Article Grounding☆14Nov 25, 2023Updated 2 years ago
- ☆15Nov 7, 2024Updated last year
- [ACL 2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information☆15Oct 27, 2024Updated last year
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆132Apr 12, 2025Updated 10 months ago
- Extending context length of visual language models☆12Dec 18, 2024Updated last year
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding"☆22Dec 8, 2024Updated last year
- On Path to Multimodal Generalist: General-Level and General-Bench☆18Jul 11, 2025Updated 7 months ago
- ☆71Jul 28, 2025Updated 6 months ago
- [ACL 2025] Can MLLMs Understand the Deep Implication Behind Chinese Images?☆20Oct 20, 2025Updated 3 months ago
- ☆41Jun 9, 2025Updated 8 months ago
- Official Code for Neural Systematic Binder☆34Mar 27, 2023Updated 2 years ago
- ☆26Aug 21, 2025Updated 5 months ago
- [ICLR 2026] RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling☆31Feb 1, 2026Updated 2 weeks ago
- ☆38Feb 6, 2025Updated last year
- Official resource for paper Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models (ACL 20…☆15Aug 12, 2024Updated last year
- \infty-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation☆19Feb 14, 2025Updated last year
- [ICLR 26] Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow☆35Oct 3, 2025Updated 4 months ago
- [NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models☆75May 31, 2025Updated 8 months ago
- (ICLR 2026) Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing'☆58Jan 26, 2026Updated 3 weeks ago
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation☆121May 19, 2025Updated 8 months ago
- [ACM MM 22] Correspondence Matters for Video Referring Expression Comprehension☆15Sep 4, 2022Updated 3 years ago
- Code and data for paper "Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation".☆23Oct 22, 2025Updated 3 months ago
- Everything for the Paper: 'Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing'☆19Dec 2, 2023Updated 2 years ago
- Official Implementation for Inference-time Scaling of Diffusion Models through Classical Search☆31Oct 8, 2025Updated 4 months ago
- [NeurIPS 2024] An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding☆21Oct 10, 2024Updated last year
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models☆51Jun 12, 2025Updated 8 months ago
- Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging dif…☆28Jan 21, 2025Updated last year