Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators"
☆137 · Oct 19, 2023 · Updated 2 years ago
Alternatives and similar repositories for llm-data-creation
Users who are interested in llm-data-creation are comparing it to the repositories listed below.
- Targeted Data Generation with Large Language Models ☆19 · Jun 25, 2024 · Updated last year
- ☆13 · Nov 5, 2024 · Updated last year
- Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder ☆10 · Mar 16, 2023 · Updated 2 years ago
- ☆10 · Mar 16, 2024 · Updated last year
- Official repository for Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning ☆12 · Sep 2, 2024 · Updated last year
- ICML 2025 Spotlight, PCEvolve: Private Contrastive Evolution for Synthetic Dataset Generation via Few-Shot Private Data and Generative AP… ☆14 · Jun 27, 2025 · Updated 8 months ago
- ☆11 · Dec 15, 2025 · Updated 2 months ago
- This repository contains the code for the EMNLP'23 paper "AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classificati… ☆16 · Jun 3, 2024 · Updated last year
- Semantic Functions for Semantic Link ☆14 · Dec 3, 2025 · Updated 2 months ago
- AI Developer Plugin for Eclipse ☆13 · May 17, 2024 · Updated last year
- Slim tool definitions. Auto-compressed responses. Context efficiency on both end. Ships with 7 example servers, reduces context consumpti… ☆54 · Jan 24, 2026 · Updated last month
- Creates an Azure AI Service and deploys the specified models. ☆18 · Aug 22, 2025 · Updated 6 months ago
- ☆30 · Aug 2, 2024 · Updated last year
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera… ☆34 · Mar 26, 2024 · Updated last year
- Your Python AI Coder! ☆36 · May 21, 2025 · Updated 9 months ago
- [AAAI 2024] DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning ☆15 · Apr 29, 2024 · Updated last year
- ☆13 · Jun 26, 2024 · Updated last year
- [shiny app]: an interface to rio ☆16 · Apr 7, 2018 · Updated 7 years ago
- Collection of scripts to pretrain T5 in unsupervised text, using PyTorch Lightning. CORD-19 pretraining provided as example. ☆32 · Apr 26, 2021 · Updated 4 years ago
- Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs). ☆105 · Oct 31, 2024 · Updated last year
- ☆13 · Oct 12, 2016 · Updated 9 years ago
- AI Starter Kit for AI applications in Drone technology using Intel® Optimized Tensorflow* ☆18 · May 8, 2024 · Updated last year
- Exploratory Data Analysis of the engine simulation data in dataset 6, subset FD001, from https://ti.arc.nasa.gov/tech/dash/groups/pcoe/pr… ☆16 · Apr 11, 2018 · Updated 7 years ago
- extracting "structured" information that is embedded in natural language text on the web using iterative set expansion, spanBERT, and ope… ☆17 · May 22, 2023 · Updated 2 years ago
- ☆20 · Aug 5, 2024 · Updated last year
- Synthetic Data for LLM Fine-Tuning ☆121 · Dec 5, 2023 · Updated 2 years ago
- Everything for the Paper: 'Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing' ☆19 · Dec 2, 2023 · Updated 2 years ago
- Generate Structured JSON with probs from Language Models ☆17 · Mar 23, 2025 · Updated 11 months ago
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data … ☆829 · Mar 17, 2025 · Updated 11 months ago
- ☆18 · Jan 15, 2024 · Updated 2 years ago
- Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L… ☆53 · Jun 24, 2024 · Updated last year
- Streaming AI assistant with ChatGPT, FastAPI, WebSockets and React ✨🤖🚀 ☆26 · Nov 12, 2023 · Updated 2 years ago
- PubMed Healthcare Chatbot. LLM Augmented Q&A over PubMed Search Engine. ☆27 · Jan 21, 2024 · Updated 2 years ago
- Reference code base for ML Engineering in Action, Manning Publications Author: Ben Wilson ☆20 · Oct 22, 2023 · Updated 2 years ago
- ☆47 · Feb 7, 2024 · Updated 2 years ago
- This C# demo is based on azure-search-openai-demo and uses a static web app for the frontend and Azure functions for the backend API's. T… ☆27 · Jan 22, 2025 · Updated last year
- ☆24 · Dec 12, 2024 · Updated last year
- FastAPI wrapper around DSPy ☆292 · Mar 11, 2024 · Updated last year
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning ☆367 · Sep 6, 2024 · Updated last year