wlll123456 / study_rlhf
☆80Jul 24, 2025Updated 6 months ago
Alternatives and similar repositories for study_rlhf
Users who are interested in study_rlhf are comparing it to the repositories listed below
- ☆78Jun 19, 2025Updated 7 months ago
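- Natural Language Processing course at the Hangzhou Institute for Advanced Study (杭高院), 2023☆26Nov 22, 2023Updated 2 years ago
- Final operating systems lab project, Wuhan University School of Cyber Science and Engineering, class of 2021☆12Jan 2, 2024Updated 2 years ago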
- (ACL 2025 Main) Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillat…☆33Aug 23, 2025Updated 5 months ago
- CS158 Data Structure(Honor) Course Project☆31May 31, 2019Updated 6 years ago
- A reproduction of open-r1 that trains the 0.5B, 1.5B, 3B, and 7B qwen models with GRPO and observes some interesting phenomena.☆56Apr 13, 2025Updated 10 months ago
- Pest recognition☆10Jan 13, 2023Updated 3 years ago
- Transferring Genshin PVs into a freehand style with Diffusion Model.☆10Jun 5, 2024Updated last year
- Documentation at☆14Mar 27, 2025Updated 10 months ago
- Code and Data for ACL 2025 Paper "Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework".☆23Oct 3, 2025Updated 4 months ago
- Official Github Repository for "Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees". (NeurIPS 2024)☆11Nov 30, 2025Updated 2 months ago
- mujoco playground sample☆11Nov 18, 2025Updated 2 months ago
- Official Implementation of "Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning" in AAAI2024.☆13Feb 28, 2024Updated last year
- Repository for the Findings of ACL'23 paper Label Agnostic Pre-training for Zero-shot Text Classification☆12Aug 10, 2023Updated 2 years ago
- ☆16Jun 10, 2025Updated 8 months ago
- Code space for L4DC paper "State-wise Safe Reinforcement Learning With Pixel Observations"☆12Apr 5, 2024Updated last year
- [ECCV 2024] Official PyTorch implementation of LUT "Learning with Unmasked Tokens Drives Stronger Vision Learners"☆13Dec 1, 2024Updated last year
- Dataset for paper "OmniMotion-X: Versatile Multimodal Whole-Body Motion Generation"☆20Dec 22, 2025Updated last month
- Implementation of cs336 assignment 1; the QA questions are also in the Feishu link, for reference only☆27Jul 3, 2025Updated 7 months ago
- ☆14Feb 15, 2019Updated 6 years ago
- Software Security course, Wuhan University School of Cyber Science and Engineering☆16Dec 9, 2024Updated last year
- ☆10Sep 23, 2020Updated 5 years ago
- ☆13Apr 4, 2025Updated 10 months ago
- The supplementary material for the paper "Fine-tuning Large Language Models to Improve Accuracy and Comprehensibility of Automated Code R…☆16Aug 12, 2024Updated last year
- A2A Concept☆13Apr 10, 2025Updated 10 months ago
- (ACL 2025 Main) Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification - Offici…☆17Dec 26, 2025Updated last month
- Description for MV-MATH☆15Jul 20, 2025Updated 6 months ago
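- Integrates advanced large language models such as Qwen and DeepSeek, supporting a pure LLM + classification layer mode and an LLM + LoRA + classification layer mode; the modular design and training built on transformers make it easy to adjust or replace components as needed.☆19Sep 1, 2025Updated 5 months ago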
- From Pattern Recognizers to Personalized Companions: A Survey of Large Language Models in Mental Health☆60Jan 7, 2026Updated last month
- A curated collection of research and techniques for protecting intellectual property of large language models, including watermarking, fi…☆46Updated this week
- [EMNLP 2025] Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking☆12Aug 22, 2025Updated 5 months ago
- Lightning-responsive CosyVoice2 streaming API based on FastAPI.☆26Dec 6, 2025Updated 2 months ago
- This is the python implementation of "Distance Regularized Level Set Evolution and Its Application to Image Segmentation"☆16Jul 22, 2017Updated 8 years ago
- A lightweight, continuously-updated catalog of research papers on AI agents.☆27Oct 13, 2025Updated 4 months ago
- Official repository for "TrustGeoGen: Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving"☆23Sep 1, 2025Updated 5 months ago
- Code for "An Empirical Study of Retrieval Augmented Generation with Chain-of-Thought"☆17Jul 27, 2024Updated last year
- ☆20May 14, 2025Updated 8 months ago
- KMM: Key Frame Mask Mamba for Extended Motion Generation☆19Sep 22, 2025Updated 4 months ago
- [🔥ICCV 2025] SemTalk Holistic Co-speech Motion Generation with Frame-level Semantic Emphasis☆38Dec 30, 2025Updated last month