practical-dreamer / build-a-dataset
A set of Python scripts to generate complex datasets using the OpenAI API. Still under active development and may contain bugs. Contributions are welcome.
☆11 Updated last year
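The repository's scripts themselves are not shown here, but the general pattern they describe — prompting an OpenAI chat model for structured rows and parsing the reply into a dataset — can be sketched as follows. The function names, prompt template, and model choice below are illustrative assumptions, not the repo's actual code:

```python
# Hypothetical sketch of OpenAI-API dataset generation in the style of
# build-a-dataset; prompt wording and helper names are assumptions.
import json


def build_prompt(topic: str, n_examples: int) -> str:
    """Assemble an instruction prompt asking the model for dataset rows."""
    return (
        f"Generate {n_examples} instruction/response pairs about {topic}. "
        "Return one JSON object per line with keys 'instruction' and 'response'."
    )


def parse_rows(raw: str) -> list[dict]:
    """Parse the model's line-delimited JSON reply, skipping malformed lines."""
    rows = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            rows.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # tolerate the occasional malformed line from the model
    return rows


def generate_dataset(client, topic: str, n_examples: int = 5) -> list[dict]:
    """Call the OpenAI chat API (client = openai.OpenAI()) and parse the reply."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice; any chat model works
        messages=[{"role": "user", "content": build_prompt(topic, n_examples)}],
    )
    return parse_rows(resp.choices[0].message.content)
```

Keeping prompt construction and reply parsing as separate pure functions makes the pipeline testable without API calls; only `generate_dataset` touches the network.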
Alternatives and similar repositories for build-a-dataset:
Users interested in build-a-dataset are comparing it to the libraries listed below.
- Merge Transformers language models using gradient parameters. ☆205 Updated 6 months ago
- Full fine-tuning of large language models without large memory requirements ☆93 Updated last year
- ☆111 Updated 2 months ago
- Low-rank adapter extraction for fine-tuned transformer models ☆169 Updated 9 months ago
- Simple examples using Argilla tools to build AI ☆53 Updated 3 months ago
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes. ☆82 Updated last year
- Easy-to-use, high-performance knowledge distillation for LLMs ☆46 Updated last month
- Experimental sampler to make LLMs more creative ☆30 Updated last year
- ☆74 Updated last year
- A pipeline for LLM knowledge distillation ☆89 Updated 3 weeks ago
- An unsupervised model-merging algorithm for Transformers-based language models. ☆104 Updated 9 months ago
- ☆75 Updated 10 months ago
- A repository of helpful information and emerging insights regarding LLMs ☆20 Updated last year
- ☆74 Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and the Hugging Face Hub ☆157 Updated last year
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ☆147 Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆77 Updated 10 months ago
- ☆65 Updated 8 months ago
- This is our own implementation of 'Layer Selective Rank Reduction' ☆233 Updated 8 months ago
- This repository implements the chain-of-verification paper by Meta AI ☆163 Updated last year
- A very simple interactive demo to understand common LLM samplers. ☆25 Updated 7 months ago
- Model REVOLVER, a human-in-the-loop model-mixing system. ☆33 Updated last year
- Modified Stanford-Alpaca trainer for training Replit's code model ☆40 Updated last year
- Collection of various text datasets to assist ML researchers in training or fine-tuning their models ☆20 Updated last year
- Simple setup to self-host the LLaMA3-70B model with an OpenAI-compatible API ☆20 Updated 9 months ago
- Guide to fine-tuning large language models for text completion, including example scripts and acquiring training data. ☆62 Updated 2 months ago
- Self-hosted LLM chatbot arena, with yourself as the only judge ☆36 Updated last year
- Never forget anything again! Combine AI and intelligent tooling into a local knowledge base that tracks, catalogues, annotates, and plans for you… ☆37 Updated 9 months ago
- ☆152 Updated 7 months ago
- Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs). ☆97 Updated 3 months ago