Machine learning dataset loaders for testing and example scripts
☆47 · Mar 26, 2026 · Updated 2 months ago
Alternatives and similar repositories for ml-datasets
Users that are interested in ml-datasets are comparing it to the libraries listed below.
- A crowd-sourcing project by Cambridge Digital Library undertaken during the University of Cambridge's closure period due to the Coronavir… ☆11 · Nov 1, 2024 · Updated last year
- Modular Rust transformer/LLM library using Candle ☆39 · May 5, 2024 · Updated 2 years ago
- Generate a SQLite database from Wikipedia & Wikidata dumps. ☆39 · Mar 27, 2024 · Updated 2 years ago
- Additional lookup tables and data resources for spaCy ☆115 · Jun 4, 2025 · Updated last year
- Lightweight piece tokenization library ☆12 · Apr 15, 2024 · Updated 2 years ago
- 2018 Computational Text Analysis Notebooks, University of Mannheim ☆13 · Nov 22, 2018 · Updated 7 years ago
- Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy ☆1,407 · Mar 27, 2026 · Updated 2 months ago
- SpacyV3 Text Categorizer Tutorial ☆17 · Nov 15, 2020 · Updated 5 years ago
- A web application tagging and retrieval of arguments in text ☆30 · May 1, 2023 · Updated 3 years ago
- Extracting Summary Knowledge Graphs from Long Documents ☆19 · Jul 2, 2021 · Updated 4 years ago
- Recipes for the Prodigy, our fully scriptable annotation tool ☆507 · Aug 4, 2024 · Updated last year
- Wrapper for the macOS signpost API ☆17 · Apr 24, 2023 · Updated 3 years ago
- spaCy pipeline component for generating spaCy KnowledgeBase Alias Candidates for Entity Linking ☆86 · Oct 6, 2022 · Updated 3 years ago
- Fast matrix-multiplication as a self-contained Python library - no system dependencies! ☆237 · May 14, 2026 · Updated 3 weeks ago
- A VS Code extension for annotating data with Prodigy ☆30 · Nov 25, 2021 · Updated 4 years ago
- TEMPLATE for markdown lessons ☆14 · Jun 2, 2026 · Updated last week
- Inter-annotator agreement for Doccano ☆28 · May 3, 2020 · Updated 6 years ago
- Python library for Bayesian Autoencoders ☆13 · Jun 10, 2022 · Updated 4 years ago
- Push your spaCy pipelines to the Hugging Face Hub ☆45 · Jun 2, 2024 · Updated 2 years ago
- Jupyter notebooks for spaCy examples and tutorials ☆288 · Feb 1, 2019 · Updated 7 years ago
- Information Extraction Dataset Zoo. ☆30 · Apr 9, 2022 · Updated 4 years ago
- Introduction to Programming using Python ☆19 · Jul 12, 2020 · Updated 5 years ago
- Python koans for beginner programmers ☆18 · Mar 28, 2015 · Updated 11 years ago
- Bag of, not words, but tricks! ☆68 · Oct 31, 2023 · Updated 2 years ago
- Given the URL to a public JSON document in an International Image Interoperability Framework (IIIF) image server, this script will downlo… ☆17 · Sep 6, 2022 · Updated 3 years ago
- Quantifying interactions with government services to support delivery teams to improve their own products and services ☆10 · Sep 5, 2022 · Updated 3 years ago
- A simulation of civilizations. http://stewart-taylor.github.io/MediSim/ ☆13 · Mar 22, 2016 · Updated 10 years ago
- Code for "All-In-1: Short Text Classification with One Model for All Languages" - Plank (2017), IJCNLP 2017 shared task 4 ☆16 · Oct 26, 2017 · Updated 8 years ago
- An implementation of BERT using PyTorch's TransformerEncoder ☆32 · Dec 15, 2019 · Updated 6 years ago
- ConceptNet to neo4j 2.2 ☆10 · Nov 6, 2015 · Updated 10 years ago
- Sentiment Corpus for Swedish 🇸🇪 Norwegian 🇳🇴 Danish 🇩🇰 Finnish 🇫🇮 (and English) ☆15 · May 3, 2021 · Updated 5 years ago
- ☆12 · Apr 13, 2018 · Updated 8 years ago
- 20 python libs and more: read me first! ☆12 · Apr 11, 2024 · Updated 2 years ago
- TextGraphs-13 Shared Task on Multi-Hop Inference Explanation Regeneration ☆44 · Feb 24, 2020 · Updated 6 years ago
- An implementation of figlet written in Python ☆14 · Sep 20, 2019 · Updated 6 years ago
- Functional matrix factorization via Bayesian tensor filtering ☆13 · Oct 1, 2025 · Updated 8 months ago
- ☆14 · Aug 15, 2023 · Updated 2 years ago
- data comparison tools written in python ☆16 · Mar 9, 2018 · Updated 8 years ago
- Kernel sources for https://huggingface.co/kernels-community ☆125 · Updated this week