thunlp / LEGENT
Open Platform for Embodied Agents
☆263 · Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for LEGENT
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning ☆199 · Updated last month
- Octopus, an embodied vision-language model trained with RLEF, emerging superior in embodied visual planning and programming. ☆263 · Updated 5 months ago
- [ICLR 2024] Source codes for the paper "Building Cooperative Embodied Agents Modularly with Large Language Models" ☆223 · Updated 2 weeks ago
- GRUtopia: Dream General Robots in a City at Scale ☆503 · Updated 2 months ago
- Align Anything: Training All-modality Model with Feedback ☆220 · Updated this week
- Towards Large Multimodal Models as Visual Foundation Agents ☆113 · Updated last week
- Code and implementations for the paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi e… ☆346 · Updated last month
- 💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents. ☆175 · Updated last week
- RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness ☆230 · Updated this week
- Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning. ☆252 · Updated last month
- [CVPR 2024] This is the official implementation of MP5 ☆83 · Updated 4 months ago
- OpenEQA Embodied Question Answering in the Era of Foundation Models ☆233 · Updated last month
- [arXiv 2023] Embodied Task Planning with Large Language Models ☆155 · Updated last year
- A curated list of awesome papers on Embodied AI and related research/industry-driven resources. ☆282 · Updated 3 months ago
- ☆85 · Updated 3 months ago
- Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model ☆333 · Updated 4 months ago
- [NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents ☆300 · Updated 6 months ago
- (ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life ☆313 · Updated 3 months ago
- The model, data and code for the visual GUI Agent SeeClick ☆215 · Updated 2 months ago
- The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models". ☆158 · Updated 3 weeks ago
- ☆339 · Updated last month
- Paper collections of the continuous effort start from World Models. ☆130 · Updated 4 months ago
- ☆40 · Updated 10 months ago
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ☆100 · Updated 7 months ago
- Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024) ☆196 · Updated 3 months ago
- [ICML 2024] Official code repository for 3D embodied generalist agent LEO ☆363 · Updated 3 weeks ago
- [ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model ☆341 · Updated last week
- Compose multimodal datasets 🎹 ☆204 · Updated this week
- ☆152 · Updated 4 months ago
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World ☆121 · Updated 2 weeks ago