huggingface / datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
★20,083 · Updated this week
Alternatives and similar repositories for datasets:
Users who are interested in datasets are comparing it to the libraries listed below:
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★8,673 · Updated last week
- 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. ★143,804 · Updated this week
- Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes. ★29,414 · Updated this week
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production ★9,659 · Updated 3 weeks ago
- The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production. ★9,834 · Updated this week
- The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights --… ★34,049 · Updated this week
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ★32,148 · Updated this week
- State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enter… ★14,217 · Updated 8 months ago
- Google Research ★35,463 · Updated this week
- PyTorch Tutorial for Deep Learning Researchers ★31,218 · Updated last year
- Ongoing research training transformer models at scale ★12,261 · Updated this week
- An open-source NLP research library, built on PyTorch. ★11,843 · Updated 2 years ago
- Hydra is a framework for elegantly configuring complex applications ★9,282 · Updated last week
- AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file convert… ★20,589 · Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★18,320 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★31,390 · Updated 4 months ago
- A library for efficient similarity search and clustering of dense vectors. ★34,749 · Updated this week
- Development repository for the Triton language and compiler ★15,447 · Updated this week
- Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stre… ★8,584 · Updated 2 weeks ago
- A data augmentations library for audio, image, text, and video. ★5,006 · Updated 2 months ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★21,193 · Updated 2 months ago
- An open-source, low-code machine learning library in Python ★9,311 · Updated 2 weeks ago
- The official Python client for the Huggingface Hub. ★2,573 · Updated last week
- Open standard for machine learning interoperability ★18,895 · Updated this week
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ★14,464 · Updated 2 weeks ago
- BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ★7,373 · Updated last year
- This repository contains implementations and illustrative code to accompany DeepMind publications ★13,760 · Updated 3 weeks ago
- 💫 Industrial-strength Natural Language Processing (NLP) in Python ★31,509 · Updated 3 weeks ago
- Label Studio is a multi-type data labeling and annotation tool with standardized output format ★21,957 · Updated this week
- CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image ★28,780 · Updated 9 months ago