Synthetic Text Dataset Generation for LLM projects
☆58 · Jun 16, 2026 · Updated last week
Alternatives and similar repositories for datafast
Users that are interested in datafast are comparing it to the libraries listed below.
- ☆14 · Mar 9, 2023 · Updated 3 years ago
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba ☆38 · Oct 16, 2025 · Updated 8 months ago
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way ☆18 · Nov 4, 2025 · Updated 7 months ago
- SynthGenAI - Package for Generating Synthetic Datasets using LLMs. ☆56 · Nov 24, 2025 · Updated 7 months ago
- A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs … ☆66 · Feb 6, 2025 · Updated last year
- Synthetic data for ML ☆25 · Jan 30, 2025 · Updated last year
- ☆25 · Jun 5, 2025 · Updated last year
- A GitHub App built with Probot that marks/censors Issues and Pull Requests containing offensive content. ☆10 · Dec 16, 2023 · Updated 2 years ago
- Centralize and streamline ML/AI lifecycle observability and compliance processes. ☆12 · Apr 21, 2026 · Updated 2 months ago
- A CLI for generating synthetic data ☆43 · May 14, 2025 · Updated last year
- Extract molecular SMILES embeddings from language models pre-trained with various objectives and architectures. ☆19 · Nov 9, 2023 · Updated 2 years ago
- Official website for the TRON (Token Reduced Object Notation) format ☆43 · Nov 29, 2025 · Updated 7 months ago
- ☆162 · Dec 2, 2024 · Updated last year
- A Python library for generating and loading synthetic and real-world datasets tailored for graph-based applications. ☆36 · Aug 26, 2025 · Updated 10 months ago
- Demo of knowledge graph creation and Graph RAG with BAML and Kuzu ☆73 · Sep 17, 2025 · Updated 9 months ago
- Curriculum training of instruction-following LLMs with Unsloth ☆14 · Dec 15, 2025 · Updated 6 months ago
- Evals that meet you where you are. For AI that's grounded. ☆68 · Mar 21, 2026 · Updated 3 months ago
- NeuroBLAST v3 architecture code ☆37 · Jan 6, 2026 · Updated 5 months ago
- ☆12 · Mar 4, 2025 · Updated last year
- ☆11 · Sep 27, 2024 · Updated last year
- Plug-and-play document AI with zero-shot models. ☆126 · May 11, 2026 · Updated last month
- ☆10 · Nov 12, 2024 · Updated last year
- A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal… ☆12 · Sep 16, 2024 · Updated last year
- This sample code demonstrates how to build an Amazon SageMaker environment for HPO using Optuna (an open source hyperparameter tuning fra… ☆11 · May 21, 2024 · Updated 2 years ago
- An Easy Annotation Tool for Natural Language Processing ☆11 · May 17, 2024 · Updated 2 years ago
- ☆24 · Nov 19, 2017 · Updated 8 years ago
- Bypass browser bot detection in langchain tools ☆19 · Apr 22, 2026 · Updated 2 months ago
- EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other la… ☆102 · Apr 9, 2026 · Updated 2 months ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ☆35 · Aug 24, 2024 · Updated last year
- Build datasets using natural language ☆579 · Sep 19, 2025 · Updated 9 months ago
- A Python implementation of discrete optimal transport with a Tsallis entropy regularization. ☆14 · Oct 23, 2023 · Updated 2 years ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆47 · Sep 5, 2024 · Updated last year
- ☆44 · Jan 30, 2026 · Updated 4 months ago
- ☆10 · Dec 3, 2024 · Updated last year
- AdFit Web SDK for Publisher ☆16 · Apr 14, 2026 · Updated 2 months ago
- 360M model running in the browser on WebGPU ☆23 · Aug 20, 2024 · Updated last year
- ☆12 · Jul 8, 2021 · Updated 4 years ago
- ☆15 · May 26, 2026 · Updated last month
- An experimental implementation of Burrows' Delta in Python 3 ☆12 · Jun 6, 2017 · Updated 9 years ago