wangclnlp / DeepSpeed-Chat-Extension
This repo contains some extensions of deepspeed-chat for fine-tuning LLMs (SFT+RLHF).
☆21Updated last year
Alternatives and similar repositories for DeepSpeed-Chat-Extension
Users who are interested in DeepSpeed-Chat-Extension are comparing it to the libraries listed below.
- This is the code of MMOA-RAG.☆102Updated 8 months ago
- Reinforced Multi-LLM Agents training☆70Updated 3 weeks ago
- Public code repo for COLING 2025 paper "Aligning LLMs with Individual Preferences via Interaction"☆41Updated 10 months ago
- [ACL2024] A Codebase for Incremental Learning with Large Language Models; Official released code for "Learn or Recall? Revisiting Increme…☆58Updated last year
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"☆13Updated last year
- This is my attempt to create Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g…☆38Updated 7 months ago
- code for paper Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning☆43Updated last year
- code for ACL2024-main: BatchEval: Towards Human-like Text Evaluation☆19Updated last year
- Source code of “Reinforcement Learning with Token-level Feedback for Controllable Text Generation” (NAACL 2024)☆17Updated last year
- This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language …☆171Updated 8 months ago
- Data and code for the paper: Finding Safety Neurons in Large Language Models☆20Updated last week
- Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning☆36Updated last year
- [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models☆39Updated last year
- ☆23Updated 11 months ago
- Official Repo for FoodieQA paper (EMNLP 2024)☆19Updated 7 months ago
- [ACL 2025] Knowledge Unlearning for Large Language Models☆48Updated 4 months ago
- Code for ACL 2024 accepted paper titled "SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language …☆38Updated last year
- KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality☆39Updated 2 months ago
- ☆21Updated last year
- [ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving☆24Updated 5 months ago
- Code for the paper: Metacognitive Retrieval-Augmented Large Language Models☆34Updated last year
- ☆45Updated last month
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State☆11Updated last year
- ☆76Updated 3 months ago
- ☆26Updated 5 months ago
- Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"☆34Updated last year
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization☆96Updated last year
- [EMNLP 2024] “ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models”☆26Updated last year
- Reference implementation for Token-level Direct Preference Optimization (TDPO)☆151Updated 11 months ago
- [SIGIR'24] The official implementation code of MOELoRA.☆36Updated last year