microsoft / llm-data-creation
Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators"
☆107 Updated 11 months ago
Related projects:
- ☆109 Updated last month
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆118 Updated last week
- experiments with inference on llama ☆106 Updated 3 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆65 Updated 2 months ago
- awesome synthetic (text) datasets ☆213 Updated this week
- ☆127 Updated 2 months ago
- ☆73 Updated 8 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64 Updated 2 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆57 Updated 7 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆117 Updated 3 weeks ago
- Official implementation for the paper "LongEmbed: Extending Embedding Models for Long Context Retrieval" ☆108 Updated 4 months ago
- An Open Source Toolkit For LLM Distillation ☆284 Updated last month
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆101 Updated last week
- Repository for “PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers”, NAACL24 ☆115 Updated 3 months ago
- Spherical Merge Pytorch/HF format Language Models with minimal feature loss. ☆107 Updated last year
- ☆82 Updated 3 weeks ago
- Expert Specialized Fine-Tuning ☆129 Updated last month
- Manage scalable open LLM inference endpoints in Slurm clusters ☆217 Updated 2 months ago
- Code for explaining and evaluating late chunking (chunked pooling) ☆117 Updated this week
- Small and Efficient Mathematical Reasoning LLMs ☆69 Updated 7 months ago
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆120 Updated 8 months ago
- ☆85 Updated 7 months ago
- ☆105 Updated this week
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024) ☆195 Updated 3 months ago
- ☆118 Updated 5 months ago
- ☆75 Updated 3 weeks ago
- This is an implementation of the paper: Searching for Best Practices in Retrieval-Augmented Generation ☆157 Updated 3 weeks ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 2 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆200 Updated this week
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆68 Updated last week