huggingface / tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
☆10,445 Updated this week
Alternatives and similar repositories for tokenizers
Users who are interested in tokenizers are comparing it to the libraries listed below.
- Unsupervised text tokenizer for Neural Network-based text generation. ☆11,627 Updated this week
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ☆6,488 Updated 3 weeks ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆32,143 Updated 4 months ago
- 🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools ☆21,174 Updated this week
- An open-source NLP research library, built on PyTorch. ☆11,889 Updated 3 years ago
- State-of-the-Art Text Embeddings ☆18,192 Updated last week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆9,486 Updated this week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ☆30,803 Updated last week
- Open standard for machine learning interoperability ☆20,269 Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆22,002 Updated 2 weeks ago
- A natural language modeling framework based on PyTorch ☆6,309 Updated 3 years ago
- BertViz: Visualize Attention in Transformer Models ☆7,908 Updated last month
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ☆156,173 Updated this week
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ☆7,722 Updated this week
- Ongoing research training transformer models at scale ☆15,100 Updated last week
- Trax - Deep Learning with Clear Code and Speed ☆8,305 Updated 4 months ago
- Models, data loaders and abstractions for language processing, powered by PyTorch ☆3,565 Updated 4 months ago
- A library for efficient similarity search and clustering of dense vectors. ☆38,999 Updated this week
- 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ☆3,277 Updated 3 weeks ago
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ☆14,352 Updated 3 months ago
- 🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP ☆12,818 Updated 2 years ago
- Data augmentation for NLP ☆4,645 Updated last year
- Language-Agnostic SEntence Representations ☆3,657 Updated last year
- Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conve… ☆4,229 Updated 5 months ago
- Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the mo… ☆22,978 Updated last year
- Simple, safe way to store and distribute tensors ☆3,614 Updated this week
- Open Source Neural Machine Translation and (Large) Language Models in PyTorch ☆6,990 Updated 3 months ago
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations ☆3,274 Updated 2 years ago
- The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic … ☆3,631 Updated last week
- Papers & presentation materials from Hugging Face's internal science day ☆2,053 Updated 5 years ago