huggingface / datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
★ 20,207 · Updated this week
Alternatives and similar repositories for datasets
Users who are interested in datasets are comparing it to the libraries listed below.
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production ★ 9,726 · Updated this week
- Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes. ★ 29,523 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★ 8,771 · Updated this week
- 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. ★ 144,765 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★ 21,289 · Updated 2 months ago
- Label Studio is a multi-type data labeling and annotation tool with standardized output format ★ 22,255 · Updated this week
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ★ 38,236 · Updated last week
- State-of-the-Art Text Embeddings ★ 16,781 · Updated last week
- A data augmentations library for audio, image, text, and video. ★ 5,007 · Updated 3 months ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★ 31,462 · Updated 4 months ago
- Train transformer language models with reinforcement learning. ★ 13,971 · Updated this week
- An open-source NLP research library, built on PyTorch. ★ 11,850 · Updated 2 years ago
- Notebooks using the Hugging Face libraries 🤗 ★ 4,113 · Updated this week
- An open-source, low-code machine learning library in Python ★ 9,334 · Updated last month
- Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stre… ★ 8,632 · Updated 2 weeks ago
- Learn how to design, develop, deploy and iterate on production-grade ML applications. ★ 38,626 · Updated 9 months ago
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… ★ 14,636 · Updated this week
- Ongoing research training transformer models at scale ★ 12,428 · Updated this week
- Roadmap to becoming an Artificial Intelligence Expert in 2022 ★ 29,877 · Updated last year
- Unsupervised text tokenizer for Neural Network-based text generation. ★ 10,917 · Updated last month
- A game theoretic approach to explain the output of any machine learning model. ★ 23,930 · Updated last week
- 🦉 Data Versioning and ML Experiments ★ 14,494 · Updated this week
- Lime: Explaining the predictions of any machine learning classifier ★ 11,890 · Updated 10 months ago
- A hyperparameter optimization framework ★ 12,009 · Updated this week
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ★ 14,182 · Updated this week
- The fastai deep learning library ★ 26,987 · Updated last week
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ★ 6,366 · Updated last month
- BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ★ 7,415 · Updated last year
- State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enter… ★ 14,288 · Updated 9 months ago
- GPT-3: Language Models are Few-Shot Learners ★ 15,751 · Updated 4 years ago