TencentARC / mllm-npu
mllm-npu: training multimodal large language models on Ascend NPUs
☆95Aug 29, 2024Updated last year
Alternatives and similar repositories for mllm-npu
Users that are interested in mllm-npu are comparing it to the libraries listed below
- ☆14Nov 19, 2024Updated last year
- ☆17Nov 17, 2023Updated 2 years ago
- [NeurIPS 2023] CircuitFormer: Circuit as Set of Points☆38Nov 22, 2023Updated 2 years ago
- [ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding☆73Jun 26, 2025Updated 7 months ago
- [IJCV 2024]☆20Nov 11, 2024Updated last year
- Official code for ConMIM (ICLR 2023)☆58Feb 8, 2023Updated 3 years ago
- ☆10Dec 16, 2023Updated 2 years ago
- SGLang kernel library for NPU☆96Feb 5, 2026Updated last week
- The code of 'The devil is in the labels: Semantic segmentation from sentences'.☆13Nov 13, 2022Updated 3 years ago
- ☆12Sep 24, 2024Updated last year
- A very easy-to-use TNN classify demo☆11Mar 24, 2021Updated 4 years ago
- ☆58May 13, 2025Updated 9 months ago
- [ACM MM 2024] WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition☆58Apr 8, 2025Updated 10 months ago
- Official github repo of G-LLaVA☆148Feb 20, 2025Updated 11 months ago
- Caffe++: assemble new features to enhance Caffe☕️☆11Dec 24, 2018Updated 7 years ago
- Multimodal Models in Real World☆555Feb 24, 2025Updated 11 months ago
- Source code of ICLR 2020 submission: Zeno++: Robust Fully Asynchronous SGD☆14Feb 2, 2020Updated 6 years ago
- Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries☆34Nov 19, 2025Updated 2 months ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)☆32May 15, 2023Updated 2 years ago
- Convert and run a TF model using Qualcomm SNPE tools☆12Nov 27, 2018Updated 7 years ago
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.☆33Jul 21, 2023Updated 2 years ago
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…☆13Jan 16, 2026Updated 3 weeks ago
- Description and applications of OpenAI's paper about DALL-E (2021) and implementation of other (CLIP-guided) zero-shot text-to-image gene…☆33Aug 11, 2022Updated 3 years ago
- Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".☆16Jun 20, 2023Updated 2 years ago
- ☆11Mar 3, 2020Updated 5 years ago
- Official implementation of T-PAMI25 paper "M²Diffuser: Diffusion-based Trajectory Optimization for Mobile Manipulation in 3D Scenes"☆108Jun 17, 2025Updated 7 months ago
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation☆37Sep 12, 2023Updated 2 years ago
- Official implementation of SEED-LLaMA (ICLR 2024).☆639Sep 21, 2024Updated last year
- SEED-Voken: A Series of Powerful Visual Tokenizers☆992Nov 25, 2025Updated 2 months ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; supports both image and video…☆40Jun 22, 2024Updated last year
- Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation (ICCV 2023)☆66Sep 28, 2023Updated 2 years ago
- [ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model☆37Nov 27, 2024Updated last year
- the C++ version of thundernet with ncnn☆14Feb 20, 2021Updated 4 years ago
- Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021)☆34Apr 9, 2022Updated 3 years ago
- [IJCV 2025] MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning☆76May 30, 2025Updated 8 months ago
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception☆159Dec 6, 2024Updated last year
- The project is based on PyTorch and integrates current mainstream network architectures, including VGGnet, ResNet, Densenet, MobileNet…☆16Dec 24, 2022Updated 3 years ago
- Featurized Query R-CNN☆45Jun 17, 2022Updated 3 years ago
- Bi-Directional Attention for Joint Instance and Semantic Segmentation in Point Clouds (BAN)☆16Apr 27, 2021Updated 4 years ago