ZrrSkywalker / LLaMA-Adapter
Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆81Updated last year
Related projects:
- ☆65Updated last year
- ControlLLM: Augment Language Models with Tools by Searching on Graphs☆184Updated 2 months ago
- Official code for Paper "Mantis: Multi-Image Instruction Tuning"☆158Updated last week
- Touchstone: Evaluating Vision-Language Models by Language Models☆75Updated 8 months ago
- MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts☆219Updated this week
- Recent advancements propelled by large language models (LLMs), encompassing an array of domains including Vision, Audio, Agent, Robotics,…☆107Updated last week
- A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, qwen-vl, phi3-v …☆123Updated last week
- E5-V: Universal Embeddings with Multimodal Large Language Models☆148Updated 2 months ago
- ☆111Updated 3 months ago
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)☆92Updated 2 months ago
- InstructionGPT-4☆35Updated 8 months ago
- This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for E…☆323Updated last week
- Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model"☆85Updated 6 months ago
- A curated list of the papers, repositories, tutorials, and anything related to the large language models for tools☆64Updated last year
- LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation☆121Updated 10 months ago
- [TMLR23] Official implementation of UnIVAL: Unified Model for Image, Video, Audio and Language Tasks.☆224Updated 8 months ago
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI☆55Updated last week
- [COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs☆112Updated 3 weeks ago
- Implementation of the DeepMind Flamingo vision-language model, based on Hugging Face language models and ready for training☆163Updated last year
- The Hugging Face implementation of Fine-grained Late-interaction Multi-modal Retriever.☆64Updated 2 weeks ago
- Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"☆138Updated last week
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria☆49Updated last month
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks☆123Updated 6 months ago
- FuseAI Project☆75Updated last month
- [CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback☆218Updated last week
- Towards Large Multimodal Models as Visual Foundation Agents☆87Updated 3 weeks ago
- RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness☆200Updated last week
- Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"☆252Updated 3 months ago
- MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)☆252Updated 3 weeks ago
- ☆145Updated 2 months ago