wangclnlp / DeepSpeed-Chat-Extension
This repo contains some extensions of deepspeed-chat for fine-tuning LLMs (SFT+RLHF).
☆19 Updated last year
Alternatives and similar repositories for DeepSpeed-Chat-Extension
Users who are interested in DeepSpeed-Chat-Extension are comparing it to the libraries listed below.
- Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning ☆34 Updated 8 months ago
- Code for ACL2024-main: BatchEval: Towards Human-like Text Evaluation ☆18 Updated last year
- This is my attempt to create a Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g… ☆35 Updated last week
- [ICLR 2025] Code & Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆13 Updated last year
- The official GitHub repository for the paper "R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation" (EMNLP 2024 Fin… ☆33 Updated 7 months ago
- Source code of “Reinforcement Learning with Token-level Feedback for Controllable Text Generation” (NAACL 2024) ☆14 Updated 7 months ago
- [NeurIPS 2024] HonestLLM: Toward an Honest and Helpful Large Language Model ☆26 Updated last month
- [ACL2024] A Codebase for Incremental Learning with Large Language Models; Official released code for "Learn or Recall? Revisiting Increme… ☆49 Updated 5 months ago
- Official Repo for FoodieQA paper (EMNLP 2024) ☆16 Updated 3 weeks ago
- ☆26 Updated 3 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆22 Updated 5 months ago
- Code Repo for EfficientRAG: Efficient Retriever for Multi-Hop Question Answering ☆51 Updated 4 months ago
- Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes ☆17 Updated last month
- A curated list of personalized alignment resources (continually updated). ☆34 Updated 2 weeks ago
- Code for ProTrix: Building Models for Planning and Reasoning over Tables with Sentence Context ☆18 Updated 8 months ago
- [ICLR'25] DataGen: Unified Synthetic Dataset Generation via Large Language Models ☆56 Updated 4 months ago
- [EMNLP 2024] “ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models” ☆18 Updated last year
- A method of ensemble learning for heterogeneous large language models. ☆58 Updated 11 months ago
- [SIGIR'24] The official implementation code of MOELoRA. ☆173 Updated last year
- ☆19 Updated 10 months ago
- Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability" ☆32 Updated last year
- ☆145 Updated last year
- The implementation of the paper "ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability" ☆29 Updated last month
- Public code repo for COLING 2025 paper "Aligning LLMs with Individual Preferences via Interaction" ☆32 Updated 3 months ago
- RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024 ☆77 Updated 9 months ago
- Official code for ICML 2024 paper on Persona In-Context Learning (PICLe) ☆25 Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆38 Updated last year
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆83 Updated 11 months ago
- code for paper Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning