OpenDocCN / python-code-anls
☆32 Updated 2 weeks ago
Alternatives and similar repositories for python-code-anls:
Users who are interested in python-code-anls are comparing it to the repositories listed below
- DeepSpeed Tutorial ☆94 Updated 6 months ago
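- PyTorch model-training code for single precision, half precision, mixed precision, single GPU, multi-GPU (DP / DDP), FSDP, and DeepSpeed, with a comparison of training speed and GPU memory usage across the methods ☆87 Updated 11 months ago
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a Vision Encoder. Unifying image understanding and generation. ☆35 Updated 7 months ago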
- [NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model ☆85 Updated last year
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo… ☆29 Updated 4 months ago
- Notes mainly covering multimodal knowledge for large language model (LLM) algorithm (application) engineers ☆121 Updated 9 months ago
- Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed ☆77 Updated 3 months ago
- A DiT-pytorch codebase, intended mainly for learning the DiT architecture. ☆72 Updated 11 months ago
- Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch ☆30 Updated 3 months ago
- The official implementation of RAR ☆81 Updated 10 months ago
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs ☆87 Updated last month
- ☆92 Updated 7 months ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”. ☆38 Updated 4 months ago
- Video dataset dedicated to portrait-mode video recognition. ☆44 Updated 2 months ago
- [BMVC 2024] PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition ☆72 Updated 6 months ago
- Multimodal MM + Chat collection ☆240 Updated this week
- [ICML 2024] Official PyTorch implementation of "SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-paramete… ☆88 Updated 5 months ago
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501 ☆49 Updated 6 months ago
- Building a VLM model, starting from the basic modules. ☆12 Updated 10 months ago
- This repository contains the PyTorch code for our IEEE ISBI 2024 paper "ConvLoRA and AdaBN Based Domain Adaptation via Self-Training… ☆65 Updated 4 months ago
- The official implementation of "Adapter is All You Need for Tuning Visual Tasks". ☆77 Updated 5 months ago
- [ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding ☆49 Updated 3 months ago
- ☆64 Updated 3 months ago
- ☆27 Updated last month
- Build a daily academic subscription pipeline! Get daily arXiv papers and corresponding ChatGPT summaries with pre-defined keywords. It is… ☆32 Updated last year
- A paper list of self-supervised pretraining methods ☆17 Updated last month
- ☆20 Updated 5 months ago
- Explore the Limits of Omni-modal Pretraining at Scale ☆96 Updated 5 months ago
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". … ☆46 Updated 3 months ago