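VisionVoice (视界之声) is an accessible intelligent assistant built for visually impaired people 😎. Using image recognition, feature matching, distance measurement, and related techniques, it provides photo album reading, environment description, object finding, obstacle avoidance, knowledge Q&A, and emotional conversation ✨, helping visually impaired people embrace a colorful world 💖!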
☆26 · Aug 29, 2025 · Updated 6 months ago
Alternatives and similar repositories for VisionVoice
Users that are interested in VisionVoice are comparing it to the libraries listed below
- YOLOv5 Quantization Aware Training (QAT, qat_torch branch) and Post Training Quantization with ONNX (ptq_onnx branch, ptq_onnx.ipynb) ☆15 · Feb 28, 2023 · Updated 3 years ago
- Automatic course evaluation (Xidian University edition) ☆13 · Feb 2, 2023 · Updated 3 years ago
- IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance, ICCV 2025 ☆30 · Oct 1, 2025 · Updated 5 months ago
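- An open-source, shared project combining a Xidian University life guide with study guides for computer science and cybersecurity courses: an all-round guide from enrollment through internships, jobs, and further study, giving Xidian newcomers a small taste of "Xidian shock"! ☆18 · Feb 14, 2026 · Updated 2 weeks ago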
- Multimodal Transformer for Predicting Global Minimum Adsorption Energy ☆27 · Apr 5, 2025 · Updated 10 months ago
- MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models ☆40 · Jan 28, 2026 · Updated last month
- A workflow to create computation-ready metal-organic framework database. ☆32 · Oct 9, 2025 · Updated 4 months ago
- ☆24 · Apr 4, 2024 · Updated last year
- Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning (CVPR 2025) ☆33 · Jun 10, 2025 · Updated 8 months ago
- [ICCV 2025] 2D version of Dense Policy ☆32 · Jan 14, 2026 · Updated last month
- [CVPR 2025] The official implementation of "CacheQuant: Comprehensively Accelerated Diffusion Models" ☆44 · Nov 2, 2025 · Updated 4 months ago
- [ICML'25] Official code of paper "Fast Large Language Model Collaborative Decoding via Speculation" ☆28 · Jun 23, 2025 · Updated 8 months ago
- Codes accompanying the paper "Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment" ☆36 · Feb 11, 2025 · Updated last year
- [ICCV 2025] Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs ☆57 · Feb 2, 2026 · Updated last month
- [NeurIPS 2025] Official code for paper: Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs. ☆87 · Sep 20, 2025 · Updated 5 months ago
- [ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning" ☆55 · Oct 9, 2025 · Updated 4 months ago
- Curated list of methods that focuses on improving the efficiency of diffusion models ☆44 · Jul 9, 2024 · Updated last year
- [RA-L 2025 & ICRA 2026] Motion Before Action: Diffusing Object Motion as Manipulation Condition ☆68 · Nov 4, 2025 · Updated 4 months ago
- [ICCV 2025] Dense Policy: Bidirectional Autoregressive Learning of Actions DSP ☆73 · Jan 14, 2026 · Updated last month
- Official Site for ManiFoundation Model ☆58 · May 14, 2024 · Updated last year
- [CVPR2025] Official implementation of High Fidelity Scene Text Synthesis. ☆79 · Mar 24, 2025 · Updated 11 months ago
- ☆70 · Feb 27, 2024 · Updated 2 years ago
- Code for FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks ☆79 · Dec 12, 2024 · Updated last year
- [ICLR 2026] Autoregressive Image Generation with Randomized Parallel Decoding ☆88 · Feb 16, 2026 · Updated 2 weeks ago
- Survey and paper list on efficiency-guided LLM agents (memory, tool learning, planning). ☆178 · Feb 9, 2026 · Updated 3 weeks ago
- [ICLR2025] This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision ☆90 · May 30, 2025 · Updated 9 months ago
- [ICLR 2026] Official repository of "Beyond Fixed: Training-Free Variable-Length Denoising for Diffusion Large Language Models" ☆162 · Feb 16, 2026 · Updated 2 weeks ago
- The Official Implementation of Ada-KV [NeurIPS 2025] ☆128 · Nov 26, 2025 · Updated 3 months ago
- [CVPR 2025] Official implementation of the paper "Generative Inbetweening through Frame-wise Conditions-Driven Video Generation" ☆115 · Feb 27, 2025 · Updated last year
- [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient ☆108 · Sep 27, 2025 · Updated 5 months ago
- [Neurips 24' D&B] Official Dataloader and Evaluation Scripts for LongVideoBench. ☆113 · Jul 27, 2024 · Updated last year
- ☆125 · Feb 28, 2025 · Updated last year
- Official code for K-LoRA (CVPR 2025) ☆140 · Sep 27, 2025 · Updated 5 months ago
- A tool for downloading course videos from Xidian University's lecture recording and live-streaming platform ☆147 · Feb 10, 2026 · Updated 3 weeks ago
- This is the official implementation for ControlVAR. ☆126 · Dec 10, 2024 · Updated last year
- Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference ☆241 · Feb 3, 2026 · Updated last month
- Structured Video Comprehension of Real-World Shorts ☆231 · Sep 21, 2025 · Updated 5 months ago
- [CVPR 2025 Highlight] TinyFusion: Diffusion Transformers Learned Shallow ☆160 · Dec 1, 2025 · Updated 3 months ago
- Official Implementation for Diffusion Models Without Classifier-free Guidance ☆171 · Feb 18, 2025 · Updated last year