PKU-Alignment / safe-sora

SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enhance the helpfulness and harmlessness of Large Vision Models (LVMs).

☆33

Alternatives and similar repositories for safe-sora

Users who are interested in safe-sora are comparing it to the repositories listed below.

OpenGVLab / PhyGenBench
[ICML 2025] Code and data for the paper "Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation"
☆124 Updated 11 months ago
facebookresearch / metamorph
Code for "MetaMorph: Multimodal Understanding and Generation via Instruction Tuning"
☆212 Updated 5 months ago
ML-GSAI / LLaDA-V
☆237 Updated 2 weeks ago
rongyaofang / PUMA
Empowering Unified MLLM with Multi-granular Visual Generation
☆130 Updated 8 months ago
Shentao-YANG / Dense_Reward_T2I
Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24).
☆39 Updated last year
rese1f / aurora
[ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
☆128 Updated 4 months ago
TencentARC / SEED-Bench-R1
☆90 Updated 3 months ago
Haochen-Wang409 / ross
[ICLR'25] Reconstructive Visual Instruction Tuning
☆119 Updated 6 months ago
GAIR-NLP / thinking-with-generated-images
Doodling our way to AGI ✏️ 🖼️ 🧠
☆105 Updated 4 months ago
si0wang / VisVM
☆45 Updated 9 months ago
zhijie-group / Orthus
☆60 Updated 4 months ago
PKU-YuanGroup / WISE
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
☆152 Updated last week
zhyang2226 / OPA-DPO
[CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key
☆81 Updated last week
mit-han-lab / vila-u
[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
☆389 Updated 5 months ago
aim-uofa / dLLM-MidTruth
☆52 Updated last month
selftok-team / SelftokTokenizer
Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning
☆219 Updated 4 months ago
ML-GSAI / Diffusion-LLM-Papers
A Collection of Papers on Diffusion Language Models
☆131 Updated 3 weeks ago
Open-Reasoner-Zero / Open-Vision-Reasoner
The official repository for our paper, "Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning".
☆139 Updated 3 weeks ago
chenllliang / G1
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
☆85 Updated 4 months ago
ziqipang / RandAR
[CVPR 2025 (Oral)] Open implementation of "RandAR"
☆196 Updated 2 months ago
MonoFormer / MonoFormer
The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression"
☆86 Updated 11 months ago
Fr0zenCrane / UniCoT
Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision
☆146 Updated 2 weeks ago
dvlab-research / Prompt-Highlighter
[CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs
☆153 Updated last year
TencentARC / GRPO-CARE
☆75 Updated 3 months ago
Chenyu-Wang567 / MLLM-Tool
MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning
☆132 Updated last year
PKU-YuanGroup / AsFT
Code for the paper "AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin".
☆29 Updated 3 months ago
pipilurj / bootstrapped-preference-optimization-BPO
Code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"
☆59 Updated last year
PKU-YuanGroup / Video-Bench
A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!
☆133 Updated last year
showlab / UniRL
The code repository of UniRL
☆41 Updated 4 months ago
OpenSparseLLMs / Skip-DiT
✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints
☆75 Updated 3 months ago