huggingface / tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
★10,301 · Updated 2 weeks ago
Alternatives and similar repositories for tokenizers
Users who are interested in tokenizers are comparing it to the libraries listed below.
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,377 · Updated this week
- 🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools ★20,976 · Updated last week
- Unsupervised text tokenizer for Neural Network-based text generation. ★11,508 · Updated this week
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ★6,460 · Updated last month
- Ongoing research training transformer models at scale ★14,602 · Updated this week
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★32,020 · Updated 2 months ago
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ★34,299 · Updated this week
- Development repository for the Triton language and compiler ★17,861 · Updated this week
- State-of-the-Art Text Embeddings ★18,008 · Updated this week
- BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ★7,833 · Updated 6 months ago
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… ★16,271 · Updated this week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★41,015 · Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ★7,815 · Updated this week
- Trax: Deep Learning with Clear Code and Speed ★8,294 · Updated 2 months ago
- 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ★3,215 · Updated last week
- An open-source NLP research library, built on PyTorch. ★11,886 · Updated 3 years ago
- Simple, safe way to store and distribute tensors ★3,547 · Updated this week
- ★2,916 · Updated last week
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ★153,866 · Updated this week
- Flax is a neural network library for JAX that is designed for flexibility. ★6,977 · Updated this week
- Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...) ★3,005 · Updated 6 months ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ★21,875 · Updated 5 months ago
- The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more! ★8,302 · Updated this week
- State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enter… ★14,630 · Updated last year
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ★7,687 · Updated this week
- Papers & presentation materials from Hugging Face's internal science day ★2,053 · Updated 5 years ago
- Serve, optimize and scale PyTorch models in production ★4,357 · Updated 4 months ago
- Models, data loaders and abstractions for language processing, powered by PyTorch ★3,560 · Updated 3 months ago
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ★14,333 · Updated last month
- Large Language Model Text Generation Inference ★10,709 · Updated this week