JinjieNi / OpenMoE2
The official repo for "OpenMoE 2: Sparse Diffusion Language Models".
☆19, updated this week
Alternatives and similar repositories for OpenMoE2
Users who are interested in OpenMoE2 are comparing it to the repositories listed below.
- TPTT: Transforming Pretrained Transformers into Titans (☆27, updated last week)
- Resa: Transparent Reasoning Models via SAEs (☆41, updated last week)
- The official repo for "Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem" [EMNLP25] (☆32, updated last month)
- Code for "From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios" (☆27, updated 2 months ago)
- Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) (☆40, updated this week)
- Official PyTorch implementation of TokenSet. (☆123, updated 6 months ago)
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents (☆46, updated 7 months ago)
- Landing repository for the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling" (☆33, updated 3 weeks ago)
- (☆26, updated 2 weeks ago)
- [NeurIPS 2025 Oral] Exploring Diffusion Transformer Designs via Grafting (☆54, updated 3 months ago)
- Geometric-Mean Policy Optimization (☆81, updated this week)
- (☆35, updated 8 months ago)
- PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos (☆45, updated 3 months ago)
- (☆56, updated 2 months ago)
- Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning" (☆22, updated 4 months ago)
- The official GitHub repo for "Diffusion Language Models are Super Data Learners". (☆116, updated this week)
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning (☆31, updated last month)
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion (☆13, updated 6 months ago)
- Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding" (☆53, updated 9 months ago)
- An efficient implementation of the NSA (Native Sparse Attention) kernel (☆115, updated 3 months ago)
- This repo contains the code for "MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR2025] (☆77, updated 3 months ago)
- (☆39, updated 4 months ago)
- Official implementation of "PyVision: Agentic Vision with Dynamic Tooling." (☆127, updated 2 months ago)
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement (☆112, updated 2 months ago)
- Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Model (Arxiv 2025) (☆32, updated 2 months ago)
- (☆24, updated 4 months ago)
- (☆27, updated 3 months ago)
- A multimodal agent that can interact with its own PC in a multimodal manner. (☆34, updated this week)
- Implementation of SmoothCache, a project aimed at speeding up Diffusion Transformer (DiT) based GenAI models with error-guided caching. (☆45, updated 2 months ago)
- (☆96, updated 3 weeks ago)