Solving data for LLMs - Create quality synthetic datasets!
☆151 · Jan 20, 2025 · Updated last year
Alternatives and similar repositories for dataformer
Users that are interested in dataformer are comparing it to the libraries listed below
- ☆31 · Jan 18, 2025 · Updated last year
- Small, simple agent task environments for training and evaluation ☆19 · Nov 1, 2024 · Updated last year
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,517 · Jun 5, 2025 · Updated 8 months ago
- A Python library to orchestrate LLMs in a neural network-inspired structure ☆52 · Oct 4, 2024 · Updated last year
- An attribution library for LLMs ☆46 · Sep 17, 2024 · Updated last year
- ☆10 · Oct 24, 2024 · Updated last year
- OO for LLMs ☆898 · Updated this week
- A Ruby on Rails style framework for the DSPy (Demonstrate, Search, Predict) project for Language Models like GPT, BERT, and LLama. ☆132 · Oct 16, 2024 · Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆16 · Jun 16, 2024 · Updated last year
- awesome synthetic (text) datasets ☆325 · Jan 8, 2026 · Updated last month
- A toolkit for building computer use AI agents ☆183 · Jun 26, 2025 · Updated 8 months ago
- Manipulating Python Programs ☆709 · Jan 14, 2026 · Updated last month
- ☆15 · Dec 22, 2023 · Updated 2 years ago
- Using modal.com to process FineWeb-edu data ☆20 · Apr 5, 2025 · Updated 10 months ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻🍳 ☆353 · Jun 2, 2025 · Updated 8 months ago
- OpenAI GPT hosted Agent Framework for Windows and MacOS ☆36 · Jul 8, 2024 · Updated last year
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆459 · Sep 27, 2024 · Updated last year
- vLLM with support for span semantics ☆22 · Updated this week
- Simple orchestration for EC2 spot containers ☆19 · Sep 27, 2024 · Updated last year
- Common tools for data processing ☆22 · Dec 8, 2025 · Updated 2 months ago
- An LLM playground similar to the OpenAI API playground ☆22 · Dec 26, 2023 · Updated 2 years ago
- ☆162 · Dec 2, 2024 · Updated last year
- ☆25 · May 7, 2025 · Updated 9 months ago
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆74 · Nov 4, 2025 · Updated 3 months ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆845 · Jan 28, 2025 · Updated last year
- PyTorch Implementation of Context-Aware Sequential Model for Multi-Behaviour Recommendation https://arxiv.org/abs/2312.09684 ☆10 · May 31, 2024 · Updated last year
- the framework/ sdk that lets you build browser controlling agents in 3 lines of code. join chat @ https://discord.gg/umgnyQU2K8 ☆567 · Oct 10, 2024 · Updated last year
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆449 · Feb 13, 2024 · Updated 2 years ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆23 · Mar 12, 2024 · Updated last year
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆1,097 · Feb 2, 2025 · Updated last year
- Structured information extraction from documents ☆318 · Sep 26, 2024 · Updated last year
- Flexible and powerful multi-agent AI framework ☆395 · Jan 5, 2026 · Updated last month
- An experiment in meeting transcription and diarization with just an LLM. Maybe I went a little overboard though ☆566 · Nov 20, 2025 · Updated 3 months ago
- Makes it easy to use altair from FastHTML ☆28 · Oct 9, 2024 · Updated last year
- An intuitive LLM prompting framework for multifunctional agents, by explicitly constructing a complex "thought process" from simple natur… ☆516 · Dec 20, 2024 · Updated last year
- Jason Meridth's blog ☆13 · Updated this week
- ☆15 · Apr 26, 2025 · Updated 10 months ago
- Repository for tw.org site ☆14 · Feb 11, 2026 · Updated 2 weeks ago
- ALAS: Autonomous Learning Agent System ☆15 · Aug 14, 2025 · Updated 6 months ago