mllm-npu: training multimodal large language models on Ascend NPUs
☆95Aug 29, 2024Updated last year
Alternatives and similar repositories for mllm-npu
Users who are interested in mllm-npu are comparing it to the libraries listed below.
- ☆10Dec 16, 2023Updated 2 years ago
- [ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding☆77Jun 26, 2025Updated 11 months ago
- The first decoder-only multimodal state space model☆104May 19, 2025Updated last year
- [NeurIPS 2023] CircuitFormer: Circuit as Set of Points☆38Nov 22, 2023Updated 2 years ago
- [IJCV 2024]☆21Nov 11, 2024Updated last year
- EVE Series: Encoder-Free Vision-Language Models from BAAI☆369Jul 24, 2025Updated 10 months ago
- [ACM MM 2024] WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition☆58Apr 8, 2025Updated last year
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…☆15Jan 16, 2026Updated 4 months ago
- 🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models.☆14Jan 30, 2026Updated 3 months ago
- Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries☆42Nov 19, 2025Updated 6 months ago
- Official codes for ConMIM (ICLR 2023)☆59Feb 8, 2023Updated 3 years ago
- SGLang kernel library for NPU☆137Updated this week
- A handy TNN classify demo☆11Mar 24, 2021Updated 5 years ago
- Description and applications of OpenAI's paper about DALL-E (2021) and implementation of other (CLIP-guided) zero-shot text-to-image gene…☆33Aug 11, 2022Updated 3 years ago
- [Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training☆22Mar 19, 2022Updated 4 years ago
- [ECCV 2024] Occupancy as Set of Points☆93Jul 8, 2024Updated last year
- Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity"☆33Oct 12, 2024Updated last year
- Official implementation of T-PAMI25 paper "M²Diffuser: Diffusion-based Trajectory Optimization for Mobile Manipulation in 3D Scenes"☆118Jun 17, 2025Updated 11 months ago
- ☆24Aug 17, 2024Updated last year
- SEED-Voken: A Series of Powerful Visual Tokenizers☆1,008Nov 25, 2025Updated 6 months ago
- ☆21Feb 29, 2024Updated 2 years ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)☆33May 15, 2023Updated 3 years ago
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions"☆43Jul 15, 2022Updated 3 years ago
- ☆12Sep 24, 2024Updated last year
- Official github repo of G-LLaVA☆149Feb 20, 2025Updated last year
- Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".☆17Jun 20, 2023Updated 2 years ago
- Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".☆60Jun 27, 2023Updated 2 years ago
- OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models☆123Apr 25, 2025Updated last year
- The code of 'The devil is in the labels: Semantic segmentation from sentences'.☆13Nov 13, 2022Updated 3 years ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"☆37Jul 11, 2024Updated last year
- [IJCV 2025] MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning☆78May 30, 2025Updated 11 months ago
- [arXiv '24] Efficient Cell Nuclei Instance Segmentation with Large Convolution Kernels☆48Aug 28, 2024Updated last year
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception☆159Dec 6, 2024Updated last year
- An RLHF Infrastructure for Vision-Language Models☆199Nov 15, 2024Updated last year
- Structured Video Comprehension of Real-World Shorts☆237Sep 21, 2025Updated 8 months ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; it supports both image and video…☆40Jun 22, 2024Updated last year
- ☆27Jun 17, 2022Updated 3 years ago
- ncnn HiFi-GAN☆30Sep 29, 2024Updated last year
- Cross-camera face tracking using YOLOv8 for automatic labeling and a metric-learning ReID algorithm☆10May 15, 2024Updated 2 years ago