weleen / awesome-agent
Repository about single/multi-agent, robotics, llm/vlm/vla, scientific discovery, etc.
☆19Updated 7 months ago
Alternatives and similar repositories for awesome-agent
Users who are interested in awesome-agent are comparing it to the libraries listed below.
- Parameter-Efficient Fine-Tuning for Foundation Models☆110Updated 10 months ago
- Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters☆92Updated 2 years ago
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain☆106Updated last year
- ☆37Updated 2 years ago
- VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection☆21Updated 8 months ago
- The official repository for the Scientific Paper Idea Proposer (SciPIP)☆67Updated 11 months ago
- ☆110Updated last year
- The open source implementation of "AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model"☆22Updated last year
- [CVPRW 2024] TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning. Official code for the 3rd place solution of t…☆51Updated 11 months ago
- ☆19Updated last year
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆109Updated 8 months ago
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo…☆29Updated last year
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆43Updated last year
- Build a daily academic subscription pipeline! Get daily Arxiv papers and corresponding chatGPT summaries with pre-defined keywords. It is…☆46Updated 2 years ago
- ☆51Updated 8 months ago
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆46Updated 2 years ago
- (ACL-2025 main conference) Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback☆38Updated 7 months ago
- [EMNLP'25] A novel alignment framework that leverages image retrieval to mitigate hallucinations in Vision Language Models.☆50Updated 5 months ago
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation☆70Updated 3 months ago
- DELT: Data Efficacy for Language Model Training☆43Updated 2 weeks ago
- Reproduction of LLaVA-v1.5 based on Llama-3-8b LLM backbone.☆65Updated last year
- This repo contains the source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks (NeurIPS 2024).☆42Updated last year
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models).☆121Updated last year
- SFT+RL boosts multimodal reasoning☆45Updated 7 months ago
- ☆74Updated 8 months ago
- ☆59Updated 6 months ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆73Updated last year
- Survey of small language models☆18Updated last year
- EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023)☆33Updated 2 years ago
- MLLM @ Game☆16Updated 8 months ago