huggingface / datasets
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
☆20,976 · Updated last week
Alternatives and similar repositories for datasets
Users who are interested in datasets compare it to the libraries listed below.
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production ☆10,301 · Updated 2 weeks ago
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ☆30,561 · Updated last week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆9,377 · Updated last week
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ☆153,866 · Updated this week
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ☆34,299 · Updated this week
- State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enter… ☆14,630 · Updated last year
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆41,015 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆32,020 · Updated 2 months ago
- 🦉 Data Versioning and ML Experiments ☆15,202 · Updated this week
- Unsupervised text tokenizer for Neural Network-based text generation. ☆11,508 · Updated this week
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… ☆16,271 · Updated this week
- Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others) ☆9,312 · Updated 3 weeks ago
- A hyperparameter optimization framework ☆13,217 · Updated this week
- This repository contains implementations and illustrative code to accompany DeepMind publications ☆14,552 · Updated last week
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ☆14,333 · Updated last month
- Trax - Deep Learning with Clear Code and Speed ☆8,294 · Updated 2 months ago
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ☆32,045 · Updated this week
- The fastai deep learning library ☆27,719 · Updated this week
- The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, … ☆23,329 · Updated this week
- Hydra is a framework for elegantly configuring complex applications ☆10,036 · Updated last week
- Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the mo… ☆22,972 · Updated last year
- Google Research ☆36,886 · Updated this week
- GPT-3: Language Models are Few-Shot Learners ☆15,775 · Updated 5 years ago
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ☆40,937 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆21,875 · Updated 5 months ago
- Natural Language Processing Best Practices & Examples ☆6,442 · Updated 3 years ago
- tiktoken is a fast BPE tokeniser for use with OpenAI's models. ☆16,788 · Updated 2 months ago
- Flax is a neural network library for JAX that is designed for flexibility. ☆6,977 · Updated this week
- Low-code framework for building custom LLMs, neural networks, and other AI models ☆11,630 · Updated last week
- AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file convert… ☆23,593 · Updated last week