huggingface / tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
★10,445 | Updated this week
Alternatives and similar repositories for tokenizers
Users who are interested in tokenizers are comparing it to the libraries listed below.
- Unsupervised text tokenizer for Neural Network-based text generation. ★11,627 | Updated this week
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ★6,488 | Updated 3 weeks ago
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ★14,352 | Updated 3 months ago
- An open-source NLP research library, built on PyTorch. ★11,889 | Updated 3 years ago
- State-of-the-Art Text Embeddings ★18,192 | Updated last week
- A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,486 | Updated this week
- 🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools ★21,174 | Updated this week
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★156,173 | Updated this week
- Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ★3,277 | Updated 3 weeks ago
- Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...) ★3,030 | Updated 3 weeks ago
- A natural language modeling framework based on PyTorch ★6,311 | Updated 3 years ago
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ★7,722 | Updated this week
- ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator ★19,207 | Updated this week
- Development repository for the Triton language and compiler ★18,319 | Updated this week
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ★34,794 | Updated this week
- Trax - Deep Learning with Clear Code and Speed ★8,305 | Updated 4 months ago
- Serve, optimize and scale PyTorch models in production ★4,358 | Updated 6 months ago
- 💫 Industrial-strength Natural Language Processing (NLP) in Python ★33,147 | Updated 2 months ago
- The implementation of DeBERTa ★2,189 | Updated 2 years ago
- ★2,948 | Updated 3 weeks ago
- Papers & presentation materials from Hugging Face's internal science day ★2,053 | Updated 5 years ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★32,125 | Updated 4 months ago
- Scalable embedding, reasoning, ranking for images and sentences with CLIP ★12,818 | Updated 2 years ago
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… ★16,686 | Updated this week
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ★30,803 | Updated this week
- PyTorch extensions for high performance and large scale training. ★3,397 | Updated 9 months ago
- Simple, safe way to store and distribute tensors ★3,614 | Updated this week
- BertViz: Visualize Attention in Transformer Models ★7,908 | Updated 3 weeks ago
- Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conve… ★4,229 | Updated 5 months ago
- Models, data loaders and abstractions for language processing, powered by PyTorch ★3,565 | Updated 4 months ago