instructlab / sdg

Python library for Synthetic Data Generation

☆42

Alternatives and similar repositories for sdg

Users who are interested in sdg compare it to the libraries listed below.

instructlab / training
InstructLab Training Library - Efficient Fine-Tuning with Message-Format Data
☆42 Updated this week
instructlab / eval
Python library for Evaluation
☆15 Updated this week
foundation-model-stack / fms-hf-tuning
🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.
☆47 Updated this week
IBM / text-generation-inference
IBM development fork of https://github.com/huggingface/text-generation-inference
☆61 Updated 3 months ago
instructlab / taxonomy
Taxonomy tree that will allow you to create models tuned with your data
☆274 Updated last week
IBM / unitxt
🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …
☆206 Updated this week
ibm-granite / granite-3.0-language-models
☆261 Updated last month
patronus-ai / Lynx-hallucination-detection
☆41 Updated last year
HishamAlyahya / semantic_backprop
Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖
☆72 Updated 8 months ago
facebookresearch / matrix
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…
☆81 Updated last week
MinishLab / tokenlearn
Pre-train Static Word Embeddings
☆85 Updated 2 months ago
Pleias / Pleias-RAG-Library
Python library to use Pleias-RAG models
☆61 Updated 3 months ago
mozilla-ai / lm-buddy
Your buddy in the (L)LM space.
☆64 Updated 10 months ago
davanstrien / haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
☆65 Updated last year
guidance-ai / jsonschemabench
☆51 Updated last month
ibm-granite / granite-guardian
The Granite Guardian models are designed to detect risks in prompts and responses.
☆93 Updated this week
IBM / ensemble-instruct
Codebase release for the EMNLP 2023 paper
☆19 Updated 3 months ago
weaviate / structured-rag
Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models
☆111 Updated 3 months ago
javyduck / KnowHalu
☆48 Updated last year
trapoom555 / Language-Model-STS-CFT
Improving Text Embedding of Language Models Using Contrastive Fine-tuning
☆64 Updated last year
ServiceNow / Fast-LLM
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
☆218 Updated this week
automix-llm / automix
Mixing Language Models with Self-Verification and Meta-Verification
☆105 Updated 7 months ago
Nicolas-BZRD / EuroBERT
Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including…
☆66 Updated last month
jina-ai / correlations
Simple UI for debugging correlations of text embeddings
☆288 Updated 2 months ago
foundation-model-stack / bamba
Train, tune, and infer Bamba model
☆130 Updated 2 months ago
Arize-ai / LLMTest_NeedleInAHaystack
Doing simple retrieval from LLMs at various context lengths to measure accuracy
☆102 Updated last year
cfahlgren1 / observers
A Lightweight Library for AI Observability
☆250 Updated 5 months ago
Knowledgator / FlashDeBERTa
Truly flash implementation of the DeBERTa disentangled attention mechanism.
☆62 Updated 2 months ago
openshift-psap / llm-load-test
☆47 Updated last week
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆49 Updated 5 months ago