practical-dreamer / build-a-dataset
A set of Python scripts to generate complex datasets using the OpenAI API. Still under active development and may contain bugs. Contributions are welcome.
☆11 Updated last year
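The repository's scripts themselves are not shown here, but the general pattern they describe — prompting an OpenAI chat model for structured rows and parsing the reply into a dataset — can be sketched as follows. The function names, prompt template, and model choice below are illustrative assumptions, not the repo's actual code:

```python
# Hypothetical sketch of OpenAI-API dataset generation in the style of
# build-a-dataset; prompt wording and helper names are assumptions.
import json


def build_prompt(topic: str, n_examples: int) -> str:
    """Assemble an instruction prompt asking the model for dataset rows."""
    return (
        f"Generate {n_examples} instruction/response pairs about {topic}. "
        "Return one JSON object per line with keys 'instruction' and 'response'."
    )


def parse_rows(raw: str) -> list[dict]:
    """Parse the model's line-delimited JSON reply, skipping malformed lines."""
    rows = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            rows.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # tolerate the occasional malformed line from the model
    return rows


def generate_dataset(client, topic: str, n_examples: int = 5) -> list[dict]:
    """Call the OpenAI chat API (client = openai.OpenAI()) and parse the reply."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice; any chat model works
        messages=[{"role": "user", "content": build_prompt(topic, n_examples)}],
    )
    return parse_rows(resp.choices[0].message.content)
```

Keeping prompt construction and reply parsing as separate pure functions makes the pipeline testable without API calls; only `generate_dataset` touches the network.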
Alternatives and similar repositories for build-a-dataset:
Users interested in build-a-dataset are comparing it to the libraries listed below.
- Merge Transformers language models using gradient parameters. ☆205 Updated 6 months ago
- Full fine-tuning of large language models without large memory requirements ☆93 Updated last year
- ☆111 Updated 2 months ago
- Low-rank adapter extraction for fine-tuned transformer models ☆169 Updated 9 months ago
- Simple examples using Argilla tools to build AI ☆53 Updated 3 months ago
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes. ☆82 Updated last year
- Easy-to-use, high-performance knowledge distillation for LLMs ☆46 Updated last month
- Experimental sampler to make LLMs more creative ☆30 Updated last year
- ☆74 Updated last year
- A pipeline for LLM knowledge distillation ☆89 Updated 3 weeks ago
- An unsupervised model-merging algorithm for Transformers-based language models. ☆104 Updated 9 months ago
- ☆75 Updated 10 months ago
- A repository of helpful information and emerging insights regarding LLMs ☆20 Updated last year
- ☆74 Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and the Hugging Face Hub ☆157 Updated last year
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ☆147 Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆77 Updated 10 months ago
- ☆65 Updated 8 months ago
- This is our own implementation of 'Layer Selective Rank Reduction' ☆233 Updated 8 months ago
- This repository implements the chain-of-verification paper by Meta AI ☆163 Updated last year
- A very simple interactive demo to understand common LLM samplers. ☆25 Updated 7 months ago
- Model REVOLVER, a human-in-the-loop model-mixing system. ☆33 Updated last year
- Modified Stanford-Alpaca trainer for training Replit's code model ☆40 Updated last year
- Collection of various text datasets to assist ML researchers in training or fine-tuning their models ☆20 Updated last year
- Simple setup to self-host the LLaMA3-70B model with an OpenAI-compatible API ☆20 Updated 9 months ago
- Guide to fine-tuning large language models for text completion, including example scripts and acquiring training data. ☆62 Updated 2 months ago
- Self-hosted LLM chatbot arena, with yourself as the only judge ☆36 Updated last year
- Never forget anything again! Combine AI and intelligent tooling into a local knowledge base that tracks, catalogues, annotates, and plans for you… ☆37 Updated 9 months ago
- ☆152 Updated 7 months ago
- Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs). ☆97 Updated 3 months ago