Synthetic Text Dataset Generation for LLM projects
☆58Apr 17, 2026Updated last month
Alternatives and similar repositories for datafast
Users that are interested in datafast are comparing it to the libraries listed below.
- ☆14Mar 9, 2023Updated 3 years ago
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba☆38Oct 16, 2025Updated 7 months ago
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way☆18Nov 4, 2025Updated 6 months ago
- SynthGenAI - Package for Generating Synthetic Datasets using LLMs.☆56Nov 24, 2025Updated 5 months ago
- A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs …☆65Feb 6, 2025Updated last year
- synthetic data for ml☆25Jan 30, 2025Updated last year
- ☆24Jun 5, 2025Updated 11 months ago
- A GitHub App built with Probot that marks/censors Issues and Pull Requests containing offensive content.☆10Dec 16, 2023Updated 2 years ago
- Centralize and streamline ML/AI lifecycle observability and compliance processes.☆12Apr 21, 2026Updated 3 weeks ago
- A CLI for generating synthetic data☆43May 14, 2025Updated last year
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆87Feb 10, 2026Updated 3 months ago
- SynthTextEval: A Toolkit for Generating and Evaluating Synthetic Data For High-Stakes Domains (EMNLP 2025 System Demonstration)☆27Nov 3, 2025Updated 6 months ago
- Extract Molecular SMILES embeddings from language models pre-trained with various objectives and architectures.☆19Nov 9, 2023Updated 2 years ago
- ☆162Dec 2, 2024Updated last year
- Official website for the TRON (Token Reduced Object Notation) format☆39Nov 29, 2025Updated 5 months ago
- A Python library for generating and loading synthetic and real-world datasets tailored for graph-based applications.☆36Aug 26, 2025Updated 8 months ago
- Demo of knowledge graph creation and Graph RAG with BAML and Kuzu☆73Sep 17, 2025Updated 8 months ago
- Curriculum training of instruction-following LLMs with Unsloth☆14Dec 15, 2025Updated 5 months ago
- The OS AI engineering and monitoring agent. 🦸♀️ Oversight and compliance copilot for trustworthy AI.☆46Jul 6, 2025Updated 10 months ago
- ☆12Mar 4, 2025Updated last year
- ☆11Sep 27, 2024Updated last year
- Plug-and-play document AI with zero-shot models.☆125May 11, 2026Updated last week
- ☆10Nov 12, 2024Updated last year
- A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal…☆12Sep 16, 2024Updated last year
- [KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models☆11Apr 9, 2024Updated 2 years ago
- This sample code demonstrates how to build an Amazon SageMaker environment for HPO using Optuna (an open source hyperparameter tuning fra…☆11May 21, 2024Updated last year
- An Easy Annotation Tool for Natural Language Processing☆11May 17, 2024Updated 2 years ago
- PreRanker: reranking tools before tool-use☆21Apr 9, 2025Updated last year
- ☆23Nov 19, 2017Updated 8 years ago
- EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other la…☆100Apr 9, 2026Updated last month
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…☆34Aug 24, 2024Updated last year
- Build datasets using natural language☆575Sep 19, 2025Updated 8 months ago
- A Python implementation of discrete optimal transport with a Tsallis entropy regularization.☆14Oct 23, 2023Updated 2 years ago
- ☆17Feb 18, 2026Updated 3 months ago
- Search for a DOI (Digital Object Identifier) in Sci-Hub immediately after selecting it☆17Jun 1, 2019Updated 6 years ago
- 🚀 [ICLR '25] RocketEval: Efficient Automated LLM Evaluation via Grading Checklist☆16Aug 21, 2025Updated 8 months ago
- ☆15May 12, 2025Updated last year
- 360M model running in the browser on WebGPU☆23Aug 20, 2024Updated last year
- Randomly auto-clicks inside of a drawn region☆20Dec 13, 2018Updated 7 years ago