HowieHwong / DataGen

[ICLR'25] DataGen: Unified Synthetic Dataset Generation via Large Language Models

☆64

Alternatives and similar repositories for DataGen

Users who are interested in DataGen are comparing it to the repositories listed below.

sail-sg / sdft
[ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
☆136 Updated last year
pillowsofwind / Knowledge-Conflicts-Survey
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
☆145 Updated last year
open-compass / ANAH
[ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO
☆57 Updated 6 months ago
Jeryi-Sun / ReDEeP-ICLR
The implementation of the paper "ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability"
☆50 Updated 5 months ago
RUCAIBox / HaluEval-2.0
☆47 Updated last year
zhiyuanhubj / UoT
[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models
☆103 Updated last year
RUCKBReasoning / CoT-based-Synthesizer
Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'
☆31 Updated 6 months ago
shizhediao / R-Tuning
[NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't…
☆125 Updated last year
weizhepei / InstructRAG
[ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
☆131 Updated 9 months ago
LightChen233 / reasoning-boundary
☆69 Updated 5 months ago
pldlgb / nuggets
☆86 Updated last year
tianyi-lab / Superfiltering
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
☆181 Updated 4 months ago
ZitongYang / Synthetic_Continued_Pretraining
Code implementation of synthetic continued pretraining
☆138 Updated 10 months ago
alisawuffles / proxy-tuning
Code associated with Tuning Language Models by Proxy (Liu et al., 2024)
☆121 Updated last year
dongguanting / DPA-RAG
The code and data for DPA-RAG, accepted to the WWW 2025 main conference.
☆63 Updated 3 weeks ago
yubol-bobo / Awesome-Multi-Turn-LLMs
This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language …
☆139 Updated 6 months ago
Hannibal046 / xRAG
[NeurIPS 2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token
☆161 Updated last year
PKU-Alignment / aligner
[NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct
☆190 Updated 10 months ago
starrYYxuan / LeCo
This is the implementation of LeCo.
☆31 Updated 10 months ago
SparkJiao / dpo-trajectory-reasoning
[EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".
☆82 Updated 10 months ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆67 Updated 4 months ago
sail-sg / CPO
[NeurIPS 2024] The official implementation of the paper "Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs".
☆132 Updated 8 months ago
AmourWaltz / Reliable-LLM
☆170 Updated last year
thu-coai / AutoDetect
Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.
☆44 Updated last year
hkust-nlp / Activation_Decoding
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
☆62 Updated last year
jlko / long_hallucinations
Codebase for reproducing the experiments of the semantic uncertainty paper (paragraph-length experiments).
☆75 Updated last year
liuqidong07 / MOELoRA-peft
[SIGIR'24] The official implementation code of MOELoRA.
☆184 Updated last year
OpenMOSS / Say-I-Dont-Know
[ICML'2024] Can AI Assistants Know What They Don't Know?
☆83 Updated last year
TianHongZXY / RLVR-Decomposed
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆125 Updated 3 weeks ago
OSU-NLP-Group / LLM-Knowledge-Conflict
[ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"
☆77 Updated last year