A pipeline that uses API calls to agnostically convert unstructured data into structured training data
☆32Sep 22, 2024Updated last year
Alternatives and similar repositories for datagen
Users who are interested in datagen are comparing it to the libraries listed below:
- quick playground to animate pippin☆14Nov 11, 2024Updated last year
- ☆22Aug 27, 2023Updated 2 years ago
- Tool to apply Legal Matter Specification Standard (LMSS) to documents☆12Aug 15, 2024Updated last year
- The Ultimate OpenCode Starter Kit. Includes Oh My OpenCode config, Superpowers installation fix, MCP Setup, and Windows Crash Fix (exit_c…☆17Feb 10, 2026Updated 2 weeks ago
- ☆11Aug 26, 2024Updated last year
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes.☆83Sep 10, 2023Updated 2 years ago
- ☆14Mar 28, 2024Updated last year
- Simple Implementation of TinyGPTV in super simple Zeta lego blocks☆16Nov 11, 2024Updated last year
- ☆13May 7, 2023Updated 2 years ago
- Tiktok is an advanced multimedia recommender system that fuses the generative modality-aware collaborative self-augmentation and contrast…☆14Aug 18, 2023Updated 2 years ago
- AutoGPT maintainer/reviewer system☆16May 26, 2023Updated 2 years ago
- Minimal implementation of VCRec (2024) for collapse prevention.☆18Jan 28, 2025Updated last year
- AgentOS is a lightweight, single-file implementation that provides a robust foundation for building autonomous AI agents. It implements t…☆22Jul 11, 2025Updated 7 months ago
- Starter template for python projects☆18Feb 15, 2024Updated 2 years ago
- ☆45Oct 13, 2023Updated 2 years ago
- The Elasticsearch adapter for Microsoft Kernel Memory.☆19Aug 1, 2024Updated last year
- Calling LLM APIs on a Raspberry Pi for lulz☆24Apr 17, 2023Updated 2 years ago
- ☆22Sep 18, 2023Updated 2 years ago
- ☆63Sep 23, 2024Updated last year
- A framework for building large-scale, deterministic, interactive workflows with a fault-tolerant, conversational UX☆44Feb 5, 2026Updated 3 weeks ago
- A repository re-creating the PromptBreeder Evolutionary Algorithm from the DeepMind Paper in Python using LMQL as the backend.☆27Oct 27, 2023Updated 2 years ago
- Experimental scripts for researching data adaptive learning rate scheduling.☆22Oct 18, 2023Updated 2 years ago
- Synthetic Data Generation using LLM via Argilla, Distilabel, ChatGPT, etc.☆30May 29, 2024Updated last year
- A red teaming agent☆18Oct 15, 2025Updated 4 months ago
- ☆31Jan 23, 2026Updated last month
- Latent Diffusion Language Models☆70Sep 20, 2023Updated 2 years ago
- Official PyTorch implementation of Self-emerging Token Labeling☆35Mar 27, 2024Updated last year
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you…☆37May 14, 2024Updated last year
- Repo to reproduce the First-Explore paper results☆39Dec 25, 2024Updated last year
- Libraries, guides, blueprints, and sample code to enable rapidly building 0-1 applications on iOS, Android, and web.☆11May 12, 2023Updated 2 years ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI☆222Apr 29, 2024Updated last year
- MCP server for Google search and page fetching using headless Chromium☆67Feb 21, 2026Updated last week
- An Educational Framework Based on PyTorch for Deep Learning Education and Exploration☆10Dec 24, 2023Updated 2 years ago
- A toy AI agent that can write programs, powered by Dagger☆38Apr 1, 2025Updated 10 months ago
- Recycling diverse models☆46Jan 18, 2023Updated 3 years ago
- A library for squeakily cleaning and filtering language datasets.☆50Jul 10, 2023Updated 2 years ago
- Generalised UDRL☆37May 12, 2022Updated 3 years ago
- Various transformers for FSDP research☆38Nov 11, 2022Updated 3 years ago
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, …☆43Nov 9, 2023Updated 2 years ago