huggingface / datasets

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

☆20,921

Alternatives and similar repositories for datasets

Users who are interested in datasets are comparing it to the libraries listed below.

huggingface / transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model…
☆153,203 Updated this week
huggingface / tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
☆10,252 Updated this week
google / sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
☆11,474 Updated last week
huggingface / accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i…
☆9,329 Updated this week
microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆21,851 Updated 5 months ago
facebookresearch / metaseq
Repo for external large-scale work
☆6,548 Updated last year
google-research / text-to-text-transfer-transformer
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
☆6,455 Updated 3 weeks ago
NVIDIA / DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enter…
☆14,602 Updated last year
deepspeedai / DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
☆40,890 Updated last week
huggingface / sentence-transformers
State-of-the-Art Text Embeddings
☆17,942 Updated last week
NVIDIA / Megatron-LM
Ongoing research training transformer models at scale
☆14,389 Updated this week
Lightning-AI / pytorch-lightning
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
☆30,530 Updated last week
facebookresearch / ParlAI
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
☆10,625 Updated 2 years ago
google / trax
Trax — Deep Learning with Clear Code and Speed
☆8,295 Updated 2 months ago
jessevig / bertviz
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
☆7,797 Updated 6 months ago
huggingface / notebooks
Notebooks using the Hugging Face libraries 🤗
☆4,388 Updated this week
facebookresearch / fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
☆32,008 Updated 2 months ago
huggingface / optimum
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization…
☆3,192 Updated 2 weeks ago
ThilinaRajapakse / simpletransformers
Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conve…
☆4,229 Updated 3 months ago
google-research / google-research
Google Research
☆36,832 Updated this week
sebastianruder / NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the mo…
☆22,966 Updated last year
EleutherAI / gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
☆7,343 Updated 2 months ago
allenai / allennlp
An open-source NLP research library, built on PyTorch.
☆11,887 Updated 3 years ago
wandb / wandb
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
☆10,576 Updated last week
huggingface / peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
☆20,157 Updated last week
PAIR-code / lit
The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic …
☆3,614 Updated this week
huggingface / blog
Public repo for HF blog posts
☆3,212 Updated this week
openai / gpt-3
GPT-3: Language Models are Few-Shot Learners
☆15,778 Updated 5 years ago
huggingface / diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
☆31,812 Updated this week
microsoft / onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
☆18,504 Updated this week