Synthetic Text Dataset Generation for LLM projects
☆58 · Apr 17, 2026 · Updated last week
Alternatives and similar repositories for datafast
Users who are interested in datafast are comparing it to the libraries listed below.
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba ☆38 · Oct 16, 2025 · Updated 6 months ago
- ☆28 · Feb 11, 2026 · Updated 2 months ago
- SynthGenAI - Package for Generating Synthetic Datasets using LLMs. ☆56 · Nov 24, 2025 · Updated 5 months ago
- A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs … ☆66 · Feb 6, 2025 · Updated last year
- ☆24 · Jun 5, 2025 · Updated 10 months ago
- Centralize and streamline ML/AI lifecycle observability and compliance processes. ☆12 · Apr 21, 2026 · Updated last week
- A curated list of materials on AI guardrails ☆52 · Jun 3, 2025 · Updated 10 months ago
- Extract Molecular SMILES embeddings from language models pre-trained with various objectives and architectures. ☆18 · Nov 9, 2023 · Updated 2 years ago
- ☆162 · Dec 2, 2024 · Updated last year
- A Python library for generating and loading synthetic and real-world datasets tailored for graph-based applications. ☆37 · Aug 26, 2025 · Updated 8 months ago
- Demo of knowledge graph creation and Graph RAG with BAML and Kuzu ☆73 · Sep 17, 2025 · Updated 7 months ago
- Curriculum training of instruction-following LLMs with Unsloth ☆14 · Dec 15, 2025 · Updated 4 months ago
- The OS AI engineering and monitoring agent. 🦸♀️ Oversight and compliance copilot for trustworthy AI. ☆46 · Jul 6, 2025 · Updated 9 months ago
- NeuroBLAST v3 architecture code ☆37 · Jan 6, 2026 · Updated 3 months ago
- ☆12 · Mar 4, 2025 · Updated last year
- ☆11 · Sep 27, 2024 · Updated last year
- Feature Selection using Simulated Annealing ☆11 · Aug 10, 2022 · Updated 3 years ago
- A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal… ☆11 · Sep 16, 2024 · Updated last year
- [KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models ☆11 · Apr 9, 2024 · Updated 2 years ago
- This sample code demonstrates how to build an Amazon SageMaker environment for HPO using Optuna (an open source hyperparameter tuning fra… ☆11 · May 21, 2024 · Updated last year
- An Easy Annotation Tool for Natural Language Processing ☆11 · May 17, 2024 · Updated last year
- PreRanker: reranking tools before tool-use ☆21 · Apr 9, 2025 · Updated last year
- SleepLM: Natural-Language Intelligence for Human Sleep ☆35 · Mar 10, 2026 · Updated last month
- EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other la… ☆98 · Apr 9, 2026 · Updated 2 weeks ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ☆34 · Aug 24, 2024 · Updated last year
- Build datasets using natural language ☆573 · Sep 19, 2025 · Updated 7 months ago
- A Python implementation of discrete optimal transport with a Tsallis entropy regularization. ☆14 · Oct 23, 2023 · Updated 2 years ago
- ☆40 · Jan 30, 2026 · Updated 2 months ago
- Large language models for document ranking. ☆73 · Apr 16, 2026 · Updated last week
- ☆10 · Dec 3, 2024 · Updated last year
- 🚀 [ICLR '25] RocketEval: Efficient Automated LLM Evaluation via Grading Checklist ☆16 · Aug 21, 2025 · Updated 8 months ago
- ☆14 · May 12, 2025 · Updated 11 months ago
- A Tiptap extension for adding embedded content with Iframely. ☆16 · Nov 18, 2025 · Updated 5 months ago
- An experimental implementation of Burrows' Delta in Python 3 ☆12 · Jun 6, 2017 · Updated 8 years ago
- A Python library aimed at dissecting and augmenting NER training data. ☆61 · May 11, 2023 · Updated 2 years ago
- A lightweight, Vercel AI SDK based coding agent library. ☆51 · Feb 23, 2026 · Updated 2 months ago
- Python code for implementing embeddings in the Wasserstein space of elliptical distributions ☆11 · Jul 22, 2020 · Updated 5 years ago
- Demonstrate using MCP with Pydantic AI framework ☆14 · Mar 14, 2025 · Updated last year
- Poetry Corpora Annotated on Aesthetic Emotions ☆12 · Aug 2, 2022 · Updated 3 years ago