huggingface / tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
⭐ 10,431 · Updated this week
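Most of the libraries listed below are built around subword tokenization algorithms such as byte-pair encoding (BPE). As an illustration of the underlying idea (not the optimized Rust implementation in huggingface/tokenizers), here is a minimal pure-Python sketch of BPE training: repeatedly count adjacent symbol pairs and merge the most frequent one.

```python
from collections import Counter

def bpe_train(corpus, num_merges):
    """Learn BPE merge rules from a list of words (illustrative sketch)."""
    # Each word starts as a sequence of single-character symbols.
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the corpus.
        pairs = Counter()
        for word, freq in words.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace every occurrence of the best pair with one merged symbol.
        updated = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            updated[tuple(out)] += freq
        words = updated
    return merges

merges = bpe_train(["lower", "lowest", "newer", "wider"], num_merges=4)
print(merges[0])  # → ('w', 'e'), the most frequent pair in this toy corpus
```

Production tokenizers add pre-tokenization, byte-level fallbacks, and heavily optimized merge loops, which is exactly what libraries like tokenizers and sentencepiece provide.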
Alternatives and similar repositories for tokenizers
Users interested in tokenizers compare it to the libraries listed below.
- Unsupervised text tokenizer for Neural Network-based text generation. ⭐ 11,600 · Updated last week
- 🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools ⭐ 21,122 · Updated 2 weeks ago
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ⭐ 9,461 · Updated 2 weeks ago
- An open-source NLP research library, built on PyTorch. ⭐ 11,889 · Updated 3 years ago
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ⭐ 6,485 · Updated 2 weeks ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ⭐ 32,111 · Updated 4 months ago
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ⭐ 14,344 · Updated 3 months ago
- Papers & presentation materials from Hugging Face's internal science day ⭐ 2,053 · Updated 5 years ago
- Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conve… ⭐ 4,227 · Updated 5 months ago
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ⭐ 7,717 · Updated last week
- State-of-the-Art Text Embeddings ⭐ 18,153 · Updated 3 weeks ago
- The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic… ⭐ 3,629 · Updated last week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ⭐ 30,779 · Updated this week
- XLNet: Generalized Autoregressive Pretraining for Language Understanding ⭐ 6,172 · Updated 2 years ago
- 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ⭐ 3,270 · Updated 2 weeks ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ⭐ 155,627 · Updated last week
- The implementation of DeBERTa ⭐ 2,190 · Updated 2 years ago
- Ongoing research training transformer models at scale ⭐ 15,016 · Updated this week
- BertViz: Visualize Attention in Transformer Models ⭐ 7,887 · Updated 3 weeks ago
- Accessible large language models via k-bit quantization for PyTorch. ⭐ 7,912 · Updated last week
- 🪢 Knock Knock: Get notified when your training ends with only two additional lines of code ⭐ 2,823 · Updated 2 years ago
- 🤗 Evaluate: A library for easily evaluating machine learning models and datasets. ⭐ 2,408 · Updated last week
- An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries ⭐ 7,371 · Updated last month
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ⭐ 14,663 · Updated last month
- Simple, safe way to store and distribute tensors ⭐ 3,604 · Updated 2 weeks ago
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations ⭐ 3,274 · Updated 2 years ago
- Natural Language Processing Best Practices & Examples ⭐ 6,443 · Updated 3 years ago
- A framework for training and evaluating AI models on a variety of openly available dialogue datasets. ⭐ 10,626 · Updated 2 years ago
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… ⭐ 16,643 · Updated this week
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ⭐ 2,369 · Updated last year