AnswerDotAI/fastdata

☆160

Alternatives and similar repositories for fastdata

Users interested in fastdata are comparing it to the libraries listed below.

Pleias / OCRoscope
Small python package to measure OCR quality and other related metrics.
☆26 · Feb 19, 2024 · Updated 2 years ago
lightonai / ducksearch
Efficient BM25 with DuckDB 🦆
☆68 · Dec 20, 2024 · Updated last year
AnswerDotAI / claudette
Claudette is Claude's friend
☆316 · Jul 11, 2026 · Updated last week
AnswerDotAI / fastlite
A bit of extra usability for sqlite
☆229 · Jul 11, 2026 · Updated last week
fsndzomga / open_source_lrm
☆10 · Oct 24, 2024 · Updated last year
AnswerDotAI / byaldi
Use late-interaction multi-modal models such as ColPali in just a few lines of code.
☆850 · Jan 28, 2025 · Updated last year
AnswerDotAI / rerankers
A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
☆1,624 · Dec 20, 2025 · Updated 7 months ago
AnswerDotAI / fastcaddy
A simple python wrapper for using the Caddy API
☆27 · Jul 13, 2026 · Updated last week
SmallDoges / small-datasets
Distill thinking dataset more compactly and accurately!
☆38 · Jun 6, 2025 · Updated last year
arcee-ai / DAM
☆56 · Nov 6, 2024 · Updated last year
huggingface / dataset-dedupe-estimator
parquet dedupe estimator
☆26 · May 26, 2026 · Updated last month
KRLabsOrg / rulechef
Learn rule-based models from examples using LLM-powered synthesis. Replace expensive LLM calls with fast, deterministic, inspectable rege…
☆29 · Jul 10, 2026 · Updated last week
Pleias / marginalia
☆67 · Mar 4, 2024 · Updated 2 years ago
AnswerDotAI / fastkmeans
☆101 · Jul 4, 2025 · Updated last year
AnswerDotAI / msglm
msglm makes it a little easier to create messages for language models like Claude and OpenAI GPTs.
☆15 · Apr 6, 2026 · Updated 3 months ago
mixedbread-ai / baguetter
Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp…
☆210 · Aug 31, 2024 · Updated last year
dream3d-ai / torch-submit
☆10 · Dec 21, 2024 · Updated last year
davanstrien / awesome-synthetic-datasets
awesome synthetic (text) datasets
☆335 · Jan 8, 2026 · Updated 6 months ago
BhabhaAI / dataformer
Solving data for LLMs - Create quality synthetic datasets!
☆152 · Jan 20, 2025 · Updated last year
MeLeLBGU / SaGe
Code for SaGe subword tokenizer (EACL 2023)
☆28 · Nov 30, 2024 · Updated last year
davidberenstein1957 / dataset-viber
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
☆47 · Sep 5, 2024 · Updated last year
lancedb / lerobot-lancedb
☆21 · Jul 11, 2026 · Updated last week
AnswerDotAI / GeminiSave
☆54 · Apr 13, 2025 · Updated last year
AnswerDotAI / RAGatouille
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…
☆3,939 · May 17, 2025 · Updated last year
MinishLab / semhash
Fast Multimodal Semantic Deduplication & Filtering
☆945 · May 24, 2026 · Updated last month
hamelsmu / nbsanity
Render notebooks like nbviewer, but using Quarto as the renderer
☆110 · May 5, 2025 · Updated last year
enjalot / latent-data-modal
Using modal.com to process FineWeb-edu data
☆20 · Apr 11, 2026 · Updated 3 months ago
AnswerDotAI / shell_sage
ShellSage saves sysadmins’ sanity by solving shell script snafus super swiftly
☆406 · Updated this week
mixedbread-ai / maxsim-cpu
☆57 · Jul 10, 2025 · Updated last year
AnswerDotAI / ModernBERT
Bringing BERT into modernity via both architecture changes and scaling
☆1,700 · Mar 1, 2026 · Updated 4 months ago
EthanBnntt / tinygrad-vit
A minimalist implementation of the ViT (Vision Transformer) model, using tinygrad
☆17 · Sep 2, 2024 · Updated last year
diicellman / dspy-gradio-rag
RAG example using DSPy, Gradio, FastAPI
☆93 · Apr 11, 2024 · Updated 2 years ago
lightonai / pylate
Late Interaction Models Training & Retrieval
☆875 · Jul 13, 2026 · Updated last week
AnswerDotAI / FastHTML-Gallery
☆108 · May 28, 2025 · Updated last year
huggingface / yourbench
🤗 Benchmark Large Language Models Reliably On Your Data
☆450 · Apr 2, 2026 · Updated 3 months ago
huggingface / lighteval
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
☆2,483 · Jun 29, 2026 · Updated 3 weeks ago
AnswerDotAI / llm-ctx
Create an LLM XML context document from an llms.txt file
☆23 · Aug 26, 2024 · Updated last year
cfahlgren1 / hf-data-explorer
Chrome Extension for exploring Hugging Face datasets 🔎
☆48 · Sep 18, 2024 · Updated last year
parlance-labs / mcp-llms.txt
Minimal example of MCP for parsing llms.txt
☆39 · Apr 8, 2025 · Updated last year