clean up your LLM datasets
☆113 May 30, 2023 Updated 2 years ago
Alternatives and similar repositories for ambrosia
Users that are interested in ambrosia are comparing it to the libraries listed below.
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆83 Sep 10, 2023 Updated 2 years ago
- ☆45 Oct 13, 2023 Updated 2 years ago
- ☆22 Aug 27, 2023 Updated 2 years ago
- A Simple Discord Bot for the Alpaca LLM ☆98 Jun 22, 2023 Updated 2 years ago
- A library for squeakily cleaning and filtering language datasets. ☆50 Jul 10, 2023 Updated 2 years ago
- Public reports detailing responses to sets of prompts by Large Language Models. ☆32 Jan 4, 2025 Updated last year
- A collection of modular datasets generated by GPT-4, General-Instruct - Roleplay-Instruct - Code-Instruct - and Toolformer ☆1,629 Sep 15, 2023 Updated 2 years ago
- Full finetuning of large language models without large memory requirements ☆94 Sep 22, 2025 Updated 6 months ago
- Sample code to show how to create an in-memory RAG ☆10 Mar 10, 2024 Updated 2 years ago
- Seamless Voice Interactions with LLMs ☆12 Oct 28, 2023 Updated 2 years ago
- Customizable implementation of the self-instruct paper. ☆1,052 Mar 7, 2024 Updated 2 years ago
- Go bindings for Langchain AI ☆13 Apr 11, 2023 Updated 2 years ago
- A discord bot that roleplays! ☆152 Sep 25, 2023 Updated 2 years ago
- Image Diffusion block merging technique applied to transformers based Language Models. ☆56 May 8, 2023 Updated 2 years ago
- data cleaning and curation for unstructured text ☆329 Aug 6, 2024 Updated last year
- Conduct consumer interviews with synthetic focus groups using LLMs and LangChain ☆43 Sep 14, 2023 Updated 2 years ago
- ☆415 Nov 2, 2023 Updated 2 years ago
- ☆50 Mar 14, 2024 Updated 2 years ago
- Modified Stanford-Alpaca Trainer for Training Replit's Code Model ☆43 Jun 1, 2023 Updated 2 years ago
- Just a bunch of benchmark logs for different LLMs ☆119 Jul 28, 2024 Updated last year
- Rust bindings for CTranslate2 ☆14 Jun 21, 2023 Updated 2 years ago
- ☆20 Jul 12, 2023 Updated 2 years ago
- ☆76 Jan 24, 2024 Updated 2 years ago
- 🔓 The open-source autonomous agent LLM initiative 🔓 ☆91 Feb 12, 2024 Updated 2 years ago
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆272 Jan 10, 2026 Updated 2 months ago
- ☆109 Jun 2, 2023 Updated 2 years ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆23 Mar 12, 2024 Updated 2 years ago
- Camel-Coder: Collaborative task completion with multiple agents. Role-based prompts, intervention mechanism, and thoughtful suggestions ☆34 Jul 3, 2023 Updated 2 years ago
- For converting LLM datasets from one format into another. ☆22 Nov 12, 2025 Updated 4 months ago
- A public release of TimelineBuilder for building personal digital data timelines. ☆371 Sep 3, 2024 Updated last year
- A place to store reusable transformer components of my own creation or found on the interwebs ☆75 Updated this week
- Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase… ☆13 Jun 24, 2024 Updated last year
- Source code to accompany research paper on training multi token prediction language models using self-distillation. ☆26 Feb 21, 2026 Updated last month
- Transform is the main building block of data pipelines in fastai. And elsewhere if you want. ☆32 Jan 29, 2026 Updated last month
- QLoRA with Enhanced Multi GPU Support ☆38 Aug 8, 2023 Updated 2 years ago
- ☆64 Dec 21, 2024 Updated last year
- Automatically research and outbound companies with Exa API and google sheets app scripts. ☆18 Jun 24, 2024 Updated last year
- Merge Transformers language models by use of gradient parameters. ☆214 Aug 8, 2024 Updated last year
- ☆74 Sep 5, 2023 Updated 2 years ago