LanD-FBK / prodigy-dataset
PRODIGy is a collection of dialogues in which each conversation is aligned with speaker profile representations.
☆19 Updated 10 months ago
Alternatives and similar repositories for prodigy-dataset
Users that are interested in prodigy-dataset are comparing it to the libraries listed below
- Scripts for generating synthetic finetuning data for reducing sycophancy. ☆117 Updated 2 years ago
- ☆156 Updated last year
- ☆74 Updated last year
- ☆68 Updated 2 years ago
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`. ☆155 Updated 2 years ago
- Reverse Instructions to generate instruction tuning data with corpus examples ☆216 Updated last year
- ☆53 Updated last year
- ☆49 Updated 2 years ago
- ☆34 Updated last year
- On Transferability of Prompt Tuning for Natural Language Processing ☆100 Updated last year
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆86 Updated last year
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆248 Updated 2 years ago
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision ☆95 Updated last year
- The Synthetic-Persona-Chat dataset is a synthetically generated persona-based dialogue dataset. It extends the original Persona-Chat data… ☆104 Updated last year
- Unofficial implementation of AlpaGasus ☆93 Updated 2 years ago
- Code for "Democratizing Reasoning Ability: Tailored Learning from Large Language Model", EMNLP 2023 ☆36 Updated last year
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆116 Updated 4 months ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆135 Updated last year
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages ☆52 Updated 3 months ago
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al. ☆165 Updated 2 years ago
- ☆129 Updated last year
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆149 Updated last year
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆89 Updated last year
- Benchmarking LLMs' Emotional Alignment with Humans ☆114 Updated 9 months ago
- This is a repository for sharing papers in the field of persona-based conversational AI. The related source code for each paper is linked… ☆168 Updated last year
- Codebase for LLM story generation; updated version of https://github.com/yangkevin2/doc-story-generation ☆86 Updated last year
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales ☆131 Updated 9 months ago
- Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆211 Updated 11 months ago
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models" ☆56 Updated 2 years ago
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location. ☆83 Updated last year