Some study notes on the official LLaVA code
☆29Oct 11, 2024Updated last year
Alternatives and similar repositories for llava-handbook
Users that are interested in llava-handbook are comparing it to the libraries listed below.
- [NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"☆206Jul 17, 2025Updated 8 months ago
- 📖Curated list about the reasoning ability of MLLMs, including OpenAI o1, OpenAI o3-mini, and Slow-Thinking.☆13Feb 7, 2025Updated last year
- Get CLIP ViT text tokens about an image, visualize attention as a heatmap.☆15Aug 8, 2023Updated 2 years ago
- Building a task-oriented chatbot with Rasa☆13Dec 8, 2022Updated 3 years ago
- Training a LLaVA model with better Chinese support, with open-sourced training code and data.☆81Sep 6, 2024Updated last year
- ☆22Sep 5, 2025Updated 6 months ago
- Code for PostEdit☆23Apr 28, 2025Updated 11 months ago
- MSWAL☆14Nov 7, 2025Updated 4 months ago
- Team FDVTS_DR's solutions for MICCAI2022 Diabetic Retinopathy Analysis Challenge (DRAC)☆15Mar 5, 2024Updated 2 years ago
- Python script to download conference papers automatically☆16Sep 10, 2024Updated last year
- The code for ACM MM2024 (Multimodal Unlearnable Examples: Protecting Data against Multimodal Contrastive Learning)☆15Jul 18, 2024Updated last year
- ☆17Feb 25, 2023Updated 3 years ago
- Implementation of Spectral Leakage and Rethinking the Kernel Size in CNNs in PyTorch☆14Feb 3, 2021Updated 5 years ago
- This is an official repository of "VLAttack: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models" (NeurIPS 2…☆66Mar 22, 2025Updated last year
- AdvAD: Exploring Non-Parametric Diffusion for Imperceptible Adversarial Attacks☆19May 12, 2025Updated 10 months ago
- Explorations into the proposed SDFT, Self-Distillation Enables Continual Learning, from Shenfeld et al. of MIT☆30Feb 6, 2026Updated last month
- [NAACL 2024] Z-GMOT: Zero-shot Generic Multiple Object Tracking☆13May 3, 2024Updated last year
- Multi-Person Tracking in Tour Guide Robot☆10Aug 23, 2022Updated 3 years ago
- Building Agentic-RAG with LangGraph☆22Jul 30, 2025Updated 8 months ago
- This repository is an official implementation of the paper A Simple Baseline for Open-World Tracking via Self-training.☆10Jan 26, 2024Updated 2 years ago
- [ACL 25] SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities☆29Apr 2, 2025Updated 11 months ago
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022☆31May 29, 2023Updated 2 years ago
- Papers of "A Survey on Multimodal LLMs from the Perspective of Input-Output Space Extension"☆17Feb 4, 2026Updated last month
- ☆18Mar 10, 2026Updated 3 weeks ago
- Detection of LLM-Generated Codes [ICSE2025]☆32Jul 5, 2025Updated 8 months ago
- ☆14Apr 20, 2020Updated 5 years ago
- PKU-I2IQA: An Image-to-Image Quality Assessment Database for AI Generated Images☆16Dec 4, 2024Updated last year
- ☆34Jun 22, 2023Updated 2 years ago
- Official code for EnvSDD (Environmental Sound Deepfake Detection)☆31Dec 13, 2025Updated 3 months ago
- Fast instruction tuning with Llama2☆11Apr 8, 2024Updated last year
- [NeurIPS 2025 D&B] BackdoorDM: A Comprehensive Benchmark for Backdoor Learning in Diffusion Model☆26Aug 1, 2025Updated 7 months ago
- Official repo for NeurIPS'24 paper "WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models"☆19Dec 16, 2024Updated last year
- An unofficial pytorch implementation of the BiHDM model proposed by Yang et al. for decoding emotion from multi-channel EEG recordings, w…☆15Apr 6, 2023Updated 2 years ago
- Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".☆59Jun 27, 2023Updated 2 years ago
- Official implementation of deep-multi-trajectory-based single object tracking (IEEE T-CSVT 2021).☆12Aug 24, 2022Updated 3 years ago
- Convert .vox to .obj☆14Nov 24, 2018Updated 7 years ago
- ☆12Mar 27, 2025Updated last year
- [COLING 2025 Industry] LoRA Soups☆19Nov 29, 2024Updated last year
- Official Repository for CVPR 2024 Paper: "Large Language Models are Good Prompt Learners for Low-Shot Image Classification"☆42Jul 1, 2024Updated last year