instructlab / sdg
Python library for Synthetic Data Generation
☆40 · Updated this week
Alternatives and similar repositories for sdg:
Users who are interested in sdg are comparing it to the libraries listed below.
- InstructLab Training Library - Efficient Fine-Tuning with Message-Format Data ☆33 · Updated this week
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆60 · Updated 3 months ago
- Python library for Evaluation ☆14 · Updated 2 weeks ago
- 🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP. ☆39 · Updated this week
- Taxonomy tree that will allow you to create models tuned with your data ☆255 · Updated this week
- 🦄 Unitxt: a python library for getting data fired up and set for training and evaluation ☆183 · Updated this week
- Place to hack on UI for InstructLab ☆26 · Updated this week
- codebase release for EMNLP2023 paper publication ☆19 · Updated last year
- GitHub bot to assist with the taxonomy contribution workflow ☆16 · Updated 5 months ago
- ☆38 · Updated 3 weeks ago
- The Granite Guardian models are designed to detect risks in prompts and responses. ☆77 · Updated 3 weeks ago
- Dolomite Engine is a library for pretraining/finetuning LLMs ☆47 · Updated this week
- ☆12 · Updated last week
- Estimate resources needed to train LLMs ☆13 · Updated last month
- InstructLab Community wide collaboration space including contributing, security, code of conduct, etc ☆89 · Updated this week
- ☆255 · Updated 4 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆103 · Updated 4 months ago
- "Syntriever: How to Train Your Retriever with Synthetic Data from LLMs" the Nations of the Americas Chapter of the Association for Comput… ☆24 · Updated last month
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆59 · Updated last year
- Improve ROSA customer experience (and customer retention) by leveraging foundation models to do “gpt-chat” style search of Red Hat custo… ☆27 · Updated last year
- Pre-training code for CrystalCoder 7B LLM ☆54 · Updated 11 months ago
- Build Enterprise RAG (Retrieval-Augmented Generation) Pipelines to tackle various Generative AI use cases with LLMs by simply plugging co… ☆108 · Updated 8 months ago
- Caikit is an AI toolkit that enables users to manage models through a set of developer friendly APIs. ☆104 · Updated 6 months ago
- Build document-native LLM applications ☆53 · Updated 7 months ago
- Complex Function Calling Benchmark. ☆92 · Updated 2 months ago
- ☆40 · Updated 2 months ago
- Advanced Reasoning Benchmark Dataset for LLMs ☆45 · Updated last year
- ☆12 · Updated 9 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 · Updated 11 months ago
- CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments ☆50 · Updated last month