A method to make a VLM think like R1
☆21May 28, 2025Updated last year
Alternatives and similar repositories for deepseek-r1-vision
Users that are interested in deepseek-r1-vision are comparing it to the libraries listed below.
- Demo for Qwen2.5-VL-3B-Instruct on Axera device.☆16Sep 3, 2025Updated 8 months ago
- Cantonese TTS frontend☆16Oct 14, 2019Updated 6 years ago
- cantonese-mandarin unsupervised neural translation for sw project☆29May 2, 2023Updated 3 years ago
- Galaxea's first diffusion policy release☆37Aug 18, 2025Updated 9 months ago
- Pytorch version of the CVPR 2020 paper: Blindly Assess Image Quality in the Wild Guided by A Self-Adaptive Hyper Network☆12Jul 5, 2020Updated 5 years ago
- [TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models☆149Oct 10, 2025Updated 7 months ago
- Generative Motion Latent Flow Matching for Audio-driven Talking Portrait☆33Sep 10, 2025Updated 8 months ago
- Convert StyleGAN2 PyTorch to PaddlePaddle☆12Aug 18, 2021Updated 4 years ago
- R1-onevision, a visual language model capable of deep CoT reasoning.☆580Apr 13, 2025Updated last year
- ☆23Jan 3, 2024Updated 2 years ago
- Utilizes ONNX Runtime for TTS model.☆62May 20, 2026Updated last week
- Plant and flower dataset [PlantFlower Datasets], a dataset for the RWKV World model, based on the RWKV large model☆23Jul 6, 2023Updated 2 years ago
- [IJCV 2024] Hard-normal Example-aware Template Mutual Matching for Industrial Anomaly Detection☆24Jan 1, 2025Updated last year
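- A dynamic data dashboard built with django+pyecharts+PP-Human, with practical features such as collecting and storing foot-traffic data, alerts for events like fights and falls, and mask detection. The edge version uses onnx inference to improve efficiency; the server version supports pushing and pulling video streams☆33May 3, 2023Updated 3 years ago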
- ☆11Sep 22, 2025Updated 8 months ago
- A real-world autonomous driving simulator based on 3D Gaussian Splatting for scene augmentation☆16Jun 10, 2024Updated last year
- paddle code convert toolkit☆22Mar 19, 2023Updated 3 years ago
- Code and website for "GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation"☆43Oct 9, 2025Updated 7 months ago
- ANDROID APP that can RECOGNIZE VLC LIVE AUDIO/VIDEO STREAMING (using free Android Developers Speech Recognition API) then TRANSLATE (usin…☆13May 5, 2024Updated 2 years ago
- ☆17Apr 11, 2025Updated last year
- ☆23Oct 3, 2022Updated 3 years ago
- Repo for Polyphone Disambiguation in Mandarin Chinese with Semi-Supervised Learning☆15Feb 26, 2022Updated 4 years ago
- Deploys GroundingDINO open-world object detection with onnxruntime, including both C++ and Python versions of the program☆84Feb 2, 2024Updated 2 years ago
- A list of various eye- and head-tracking software, products, etc. ℹ️ This is just a push-mirror. We develop here: https://codeberg.org/ey…☆23Apr 24, 2026Updated last month
- convert 3D point cloud map (.pcd) to 3D occupancy grid map (.pgm)☆12Mar 15, 2024Updated 2 years ago
- Adapts the 松灵 (AgileX) Piper robotic arm to the new version of Lerobot☆27Jul 22, 2025Updated 10 months ago
- ARPABET transcription syllabifier module☆16Aug 25, 2022Updated 3 years ago
- ☆29Dec 12, 2024Updated last year
- gcc+newlib and gcc+glibc toolchains☆17Apr 12, 2019Updated 7 years ago
- Arabic Grapheme-to-Phoneme (G2P) Conversion☆14Mar 15, 2025Updated last year
- Reproducing lerobot with koch: teleoperation data collection, ACT reproduction, diffusion model reproduction, Pi model reproduction, large vision models☆29May 16, 2025Updated last year
- Dify 1.0 Plugin Support MCP Tools Agent strategies☆136Apr 15, 2026Updated last month
- Geometry processing for real-time pencil sketching☆16May 21, 2021Updated 5 years ago
- Ultra-fast, customizable AI voice dictation in any active app on Windows (MacOS and Linux coming soon)☆36Mar 8, 2026Updated 2 months ago
- g2p for english tts☆19Nov 10, 2022Updated 3 years ago
- A chat UI for Llama.cpp☆16May 13, 2026Updated 2 weeks ago
- A project for QNN quantization and CPU inference of YOLOv5 in the Qualcomm AI Engine Direct environment☆17Sep 10, 2024Updated last year
- Data Dialogue enables natural language querying of databases by integrating LLMs with SQL databases.☆14May 3, 2025Updated last year
- [ACM MM24 Poster] Official implementation of paper "MVPbev: Multi-view Perspective Image Generation from BEV with Test-time Controllabili…☆20Sep 6, 2025Updated 8 months ago