microsoft / llm-data-creation
Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators"
☆131Updated last year
Alternatives and similar repositories for llm-data-creation
Users who are interested in llm-data-creation are comparing it to the libraries listed below
- Codebase accompanying the Summary of a Haystack paper.☆78Updated 8 months ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…☆220Updated 7 months ago
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024)☆135Updated 6 months ago
- awesome synthetic (text) datasets☆281Updated 7 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"☆110Updated 8 months ago
- Repository for “PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers”, NAACL24☆138Updated 11 months ago
- ☆143Updated 10 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆77Updated 7 months ago
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀☆97Updated 7 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆111Updated 2 weeks ago
- Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"☆180Updated 6 months ago
- Official repository for the paper "ReasonIR: Training Retrievers for Reasoning Tasks".☆162Updated last month
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark☆142Updated 5 months ago
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024]☆83Updated 4 months ago
- Official Implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs"☆208Updated last week
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning☆353Updated 8 months ago
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`.☆150Updated last year
- Code accompanying "How I learned to start worrying about prompt formatting".☆105Updated 8 months ago
- ARAGOG- Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper…☆104Updated last year
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models☆96Updated last year
- ☆120Updated 8 months ago
- A simplified implementation for experimenting with RLVR on GSM8K. This repository provides a starting point for exploring reasoning.☆95Updated 3 months ago
- 🚢 Data Toolkit for Sailor Language Models☆91Updated 3 months ago
- Dense X Retrieval: What Retrieval Granularity Should We Use?☆157Updated last year
- ☆118Updated 9 months ago
- Small and Efficient Mathematical Reasoning LLMs☆71Updated last year
- Manage scalable open LLM inference endpoints in Slurm clusters☆258Updated 10 months ago
- [Preprint] Learning to Filter Context for Retrieval-Augmented Generation☆192Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated 10 months ago
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs".☆231Updated 9 months ago