Genesis-Embodied-AI / Genesis
A generative world for general-purpose robotics & embodied AI learning.
★25,217 · Updated this week
Alternatives and similar repositories for Genesis
Users who are interested in Genesis are comparing it to the libraries listed below.
- OpenVLA: An open-source vision-language-action model for robotic manipulation. ★2,901 · Updated 2 months ago
- Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥 ★39,895 · Updated this week
- Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory" ★6,829 · Updated 2 months ago
- The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems ★2,076 · Updated this week
- ★3,529 · Updated this week
- 🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning ★13,972 · Updated this week
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ★15,641 · Updated 5 months ago
- tiny vision language model ★8,040 · Updated this week
- NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills. ★4,035 · Updated this week
- An open source implementation of CLIP. ★11,888 · Updated this week
- Unified framework for robot learning built on NVIDIA Isaac Sim ★3,740 · Updated this week
- A simple screen parsing tool towards pure vision based GUI agent ★22,339 · Updated 2 months ago
- New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos ★7,999 · Updated last month
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag… ★23,646 · Updated this week
- MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone ★19,527 · Updated last week
- Official inference framework for 1-bit LLMs ★19,979 · Updated this week
- Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization ★4,521 · Updated this week
- Open-Sora: Democratizing Efficient Video Production for All ★26,573 · Updated last month
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX. ★29,235 · Updated this week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ★8,356 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ★48,865 · Updated this week
- ★1,912 · Updated 5 months ago
- [RSS 2023] Diffusion Policy Visuomotor Policy Learning via Action Diffusion ★2,606 · Updated 5 months ago
- SpatialLM: Large Language Model for Spatial Understanding ★3,229 · Updated 2 months ago
- Easily fine-tune, evaluate and deploy Qwen3, DeepSeek-R1, Llama 4 or any open source LLM / VLM! ★8,156 · Updated this week
- LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning ★2,000 · Updated 3 weeks ago
- Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your resea… ★4,468 · Updated 2 months ago
- The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more. ★44,780 · Updated this week
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API. ★39,178 · Updated this week
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ★38,420 · Updated this week