huggingface / datasets
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
☆21,174 · Updated this week
Alternatives and similar repositories for datasets
Users who are interested in datasets are comparing it to the libraries listed below.
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ☆156,173 · Updated this week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ☆30,803 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆9,486 · Updated this week
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production ☆10,445 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆32,125 · Updated 4 months ago
- Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stre… ☆8,993 · Updated this week
- BertViz: Visualize Attention in Transformer Models ☆7,908 · Updated 3 weeks ago
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ☆34,794 · Updated this week
- Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data … ☆11,297 · Updated 3 weeks ago
- Unsupervised text tokenizer for Neural Network-based text generation. ☆11,627 · Updated this week
- The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, … ☆23,947 · Updated this week
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ☆41,593 · Updated this week
- Trax: Deep Learning with Clear Code and Speed ☆8,305 · Updated 4 months ago
- ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator ☆19,207 · Updated this week
- ☁️ Build multimodal AI applications with cloud-native stack ☆21,830 · Updated 10 months ago
- 💫 Industrial-strength Natural Language Processing (NLP) in Python ☆33,147 · Updated 2 months ago
- Train transformer language models with reinforcement learning. ☆17,297 · Updated this week
- This repository contains demos I made with the Transformers library by HuggingFace. ☆11,490 · Updated 3 weeks ago
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ☆6,488 · Updated 3 weeks ago
- An open-source NLP research library, built on PyTorch. ☆11,889 · Updated 3 years ago
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆20,587 · Updated this week
- State-of-the-Art Text Embeddings ☆18,192 · Updated last week
- State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enter… ☆14,716 · Updated last year
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆97,130 · Updated this week
- This repository contains implementations and illustrative code to accompany DeepMind publications ☆14,666 · Updated 2 weeks ago
- Fast and memory-efficient exact attention ☆22,113 · Updated this week
- AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file convert… ☆24,076 · Updated this week
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… ☆16,686 · Updated this week
- An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library. ☆8,287 · Updated 3 years ago
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ☆7,722 · Updated this week