☆86Jul 24, 2025Updated 7 months ago
Alternatives and similar repositories for study_rlhf
Users that are interested in study_rlhf are comparing it to the libraries listed below
- ☆87Jun 19, 2025Updated 8 months ago
- Hangzhou Institute for Advanced Study (杭高院) natural language processing course, 2023☆26Nov 22, 2023Updated 2 years ago
- (ACL 2025 Main) Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillat…☆33Aug 23, 2025Updated 6 months ago
- CS158 Data Structure(Honor) Course Project☆32May 31, 2019Updated 6 years ago
- [AAAI 2025] Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks☆11Jun 19, 2025Updated 8 months ago
- Documentation at☆14Mar 27, 2025Updated 11 months ago
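- University LaTeX thesis defense templates; currently covers Sichuan University, Harbin Institute of Technology, and the University of Science and Technology of China.☆10Jul 22, 2024Updated last year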
- Source codes for the paper "Personalized Dynamic Music Emotion Recognition with Dual-Scale Attention-Based Meta-Learning" (PDMER) which p…☆14Mar 24, 2025Updated 11 months ago
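- WeKnora‑pro is a derivative build of the original WeKnora, focused on improving document parsing. Main improvements: 1. OCR and table extraction for scanned documents via (CPU/GPU auto-optimized), compatible with WeKnora's multimodal additions 2. Document size limit raised to 300 MB☆43Oct 29, 2025Updated 4 months ago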
- ☆11Mar 5, 2024Updated 2 years ago
- [🔥ACM MM2025] EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation☆23Dec 30, 2025Updated 2 months ago
- mujoco playground sample☆11Nov 18, 2025Updated 3 months ago
- Official Implementation of "Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning" in AAAI2024.☆13Feb 28, 2024Updated 2 years ago
- Dataset for paper "OmniMotion-X: Versatile Multimodal Whole-Body Motion Generation"☆20Dec 22, 2025Updated 2 months ago
- A pipeline for the automatic construction of geometry problems along with step-by-step solutions.☆17Aug 27, 2025Updated 6 months ago
- ⚙️AIOps Platform Supporting Both Cloud and On-Premise, Auto Fix Issue and Self Health, Deploy to Multi Cloud, Also Serves As Your AIOps Assistant.☆18Jul 28, 2025Updated 7 months ago
- ☆16Jun 10, 2025Updated 8 months ago
- Code and Data for ACL 2025 Paper "Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework".☆24Oct 3, 2025Updated 5 months ago
- ☆13Sep 26, 2025Updated 5 months ago
- "Multimodal Large Model Deployment and Fine-Tuning Guide": quickly deploy and fine-tune multimodal large models☆12Dec 4, 2024Updated last year
- Implementation of the PM2.5 prediction task from Hung-yi Lee's machine learning course☆12Nov 1, 2020Updated 5 years ago
- ☆10Sep 23, 2020Updated 5 years ago
- The reproduction of the paper "Robust Attention for Contextual Biased Visual Recognition" ICLR2023.☆13Feb 23, 2024Updated 2 years ago
- Try HopWeaver: The first automatic synthesis framework based on any corpora, with quality approaching manual annotation.☆23Jul 24, 2025Updated 7 months ago
- Code for L4DC 2022 paper: Joint Synthesis of Safety Certificate and Safe Control Policy Using Constrained Reinforcement Learning.☆15Jul 31, 2023Updated 2 years ago
- This repo is the official implementation of "Euclid’s Gift: Enhancing Spatial Perception and Reasoning in Vision‑Language Models via Geom…☆27Nov 7, 2025Updated 3 months ago
- Northeastern University (东北大学) computer systems lab course: design of a pipelined CPU☆15Jan 9, 2022Updated 4 years ago
- Gold Miner game, run with LiveServer☆12Jun 12, 2022Updated 3 years ago
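- Integrates advanced large language models such as Qwen and DeepSeek, supporting both an LLM + classification layer mode and an LLM + LoRA + classification layer mode; uses transformers with a modular design and training setup so components can be adjusted or replaced as needed.☆19Sep 1, 2025Updated 6 months ago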
- Train a tiny LLaMA model from scratch to repeat your words using Reinforcement Learning from Human Feedback (RLHF)☆17May 23, 2024Updated last year
- Southeast University Computer Organization and Architecture II course project☆11May 18, 2020Updated 5 years ago
- A2A Concept☆13Apr 10, 2025Updated 10 months ago
- [EMNLP 2025] Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking☆12Aug 22, 2025Updated 6 months ago
- Service Mesh Glossary☆13Apr 29, 2019Updated 6 years ago
- Code for Multi-Aspect Cross-modal Quantization for Generative Recommendation. (AAAI 2026 Oral)☆30Dec 9, 2025Updated 2 months ago
- A curated collection of research and techniques for protecting intellectual property of large language models, including watermarking, fi…☆46Feb 15, 2026Updated 2 weeks ago
- ☆14Apr 20, 2025Updated 10 months ago
- Koishi's Day 2025 Paper (NeurIPS 2025): "Codifying Character Logic in Role-Playing"☆23Jan 15, 2026Updated last month
- ☆33Feb 8, 2026Updated 3 weeks ago