Our 2nd-gen LMM
☆34 · May 22, 2024 · Updated last year
Alternatives and similar repositories for 360VL
Users who are interested in 360VL are comparing it to the libraries listed below.
- LMM solved catastrophic forgetting, AAAI2025 ☆46 · Apr 15, 2025 · Updated 10 months ago
- DeepTrace: A lightweight, scalable real-time diagnostic and analysis tool for distributed training tasks. ☆18 · Nov 4, 2025 · Updated 3 months ago
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler]; ☆40 · Jan 4, 2024 · Updated 2 years ago
- ☆14 · Jul 5, 2024 · Updated last year
- ☆30 · Aug 21, 2025 · Updated 6 months ago
- ☆15 · Jun 20, 2024 · Updated last year
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM ☆101 · May 17, 2024 · Updated last year
- The Next Step Forward in Multimodal LLM Alignment ☆197 · May 1, 2025 · Updated 10 months ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers ☆23 · Feb 9, 2025 · Updated last year
- Chinese-native image generation while compatible with SD eco-system, 1st-gen, AAAI2025 ☆13 · Jun 25, 2024 · Updated last year
- Web application for real-time object detection 🔎 using Flask 🌶, OpenCV, and YoloV3 weights. It uses the COCO Dataset 🖼. ☆16 · Apr 19, 2021 · Updated 4 years ago
- Official implementation of the paper "Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Rou… ☆34 · Sep 25, 2025 · Updated 5 months ago
- ☆20 · Jan 6, 2023 · Updated 3 years ago
- [ICIP 2025] Scribble-Guided Diffusion for Training-free Text-to-Image Generation ☆24 · Oct 2, 2024 · Updated last year
- ☆22 · Oct 21, 2024 · Updated last year
- Chinese CLIP models with SOTA performance. ☆60 · Aug 28, 2023 · Updated 2 years ago
- Reproduction of the complete process of DeepSeek-R1 on small-scale models, including Pre-training, SFT, and RL. ☆29 · Mar 11, 2025 · Updated 11 months ago
- DEYOv1.5 ☆29 · Jul 22, 2024 · Updated last year
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation ☆71 · Oct 17, 2025 · Updated 4 months ago
- [AAAI 2026] Test-Time Reinforcement Learning for GUI Grounding via Region Consistency https://arxiv.org/abs/2508.05615 ☆61 · Nov 8, 2025 · Updated 3 months ago
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types ☆32 · Jul 16, 2025 · Updated 7 months ago
- This project uses the LLaVA 1.6 multimodal model to implement text-to-image search and image-to-image search. ☆28 · Feb 26, 2024 · Updated 2 years ago
- ☆190 · Feb 5, 2026 · Updated 3 weeks ago
- Codebase for Instruction Following without Instruction Tuning ☆36 · Sep 24, 2024 · Updated last year
- ☆21 · Dec 14, 2025 · Updated 2 months ago
- This is the code repo for our paper "Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts". ☆45 · Sep 27, 2025 · Updated 5 months ago
- baichuan and baichuan2 finetuning and alpaca finetuning ☆33 · Mar 10, 2025 · Updated 11 months ago
- A Toolkit for Running On-device Large Language Models (LLMs) in APP ☆82 · Jul 4, 2024 · Updated last year
- [NeurIPS'25 Spotlight] Official implementation of "JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation" ☆69 · Updated this week
- The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign) ☆80 · Apr 23, 2025 · Updated 10 months ago
- GLM Series Edge Models ☆157 · Jun 12, 2025 · Updated 8 months ago
- A free and open-source focus stacking software that supports multi-focus image alignment and fusion. ☆19 · Feb 5, 2026 · Updated 3 weeks ago
- Defeating the Training-Inference Mismatch via FP16 ☆182 · Nov 14, 2025 · Updated 3 months ago
- Counting-Stars (★) ☆83 · Nov 24, 2025 · Updated 3 months ago
- ☆89 · May 21, 2025 · Updated 9 months ago
- Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory ☆29 · May 10, 2024 · Updated last year
- ☆83 · Apr 3, 2025 · Updated 10 months ago
- ☆58 · Oct 19, 2025 · Updated 4 months ago
- ☆111 · Jan 8, 2025 · Updated last year