LargeWorldModel / LWM
Large World Model -- Modeling Text and Video with Millions Context
☆7,226 Updated 4 months ago
Alternatives and similar repositories for LWM:
Users who are interested in LWM are comparing it to the libraries listed below:
- A state-of-the-art-level open visual language model | multimodal pre-trained model ☆6,353 Updated 8 months ago
- Open-Sora: Democratizing Efficient Video Production for All ☆23,372 Updated this week
- PyTorch native post-training library ☆4,856 Updated this week
- PyTorch code and models for V-JEPA self-supervised learning from video. ☆2,785 Updated 6 months ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,220 Updated 9 months ago
- This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it. ☆11,893 Updated this week
- ☆4,058 Updated 8 months ago
- The official PyTorch implementation of Google's Gemma models ☆5,349 Updated last month
- Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" ☆3,242 Updated 9 months ago
- The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud. ☆5,470 Updated 6 months ago
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆21,480 Updated 6 months ago
- Mixture-of-Experts for Large Vision-Language Models ☆2,082 Updated 2 months ago
- Large Language Model Text Generation Inference ☆9,777 Updated this week
- Modeling, training, eval, and inference code for OLMo ☆5,200 Updated this week
- Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing> ☆4,521 Updated 8 months ago
- Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension. ☆5,852 Updated 3 weeks ago
- Open weights LLM from Google DeepMind. ☆2,605 Updated this week
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆9,403 Updated 7 months ago
- An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...) ☆4,246 Updated 3 weeks ago
- Detect file content types with deep learning ☆8,428 Updated this week
- Official PyTorch Implementation of "Scalable Diffusion Models with Transformers" ☆6,833 Updated 8 months ago
- ☆3,423 Updated last week
- 【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection ☆3,153 Updated 2 months ago
- SGLang is a fast serving framework for large language models and vision language models. ☆10,325 Updated this week
- High-speed Large Language Model Serving for Local Deployment ☆8,106 Updated this week
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆2,916 Updated last week
- Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models ☆3,065 Updated last month
- InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions ☆2,755 Updated 3 weeks ago
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆11,592 Updated this week
- Official inference library for Mistral models ☆9,991 Updated 3 months ago