huggingface / tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
★9,038 · Updated this week
Related projects
Alternatives and complementary repositories for tokenizers
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ★6,160 · Updated last month
- Unsupervised text tokenizer for Neural Network-based text generation. ★10,252 · Updated last week
- An open-source NLP research library, built on PyTorch. ★11,756 · Updated last year
- Trax - Deep Learning with Clear Code and Speed ★8,093 · Updated last month
- TensorFlow code and pre-trained models for BERT ★38,156 · Updated 3 months ago
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★7,911 · Updated this week
- Serve, optimize and scale PyTorch models in production ★4,209 · Updated 2 weeks ago
- Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes. ★28,337 · Updated this week
- State-of-the-Art Text Embeddings ★15,236 · Updated this week
- Ongoing research training transformer models at scale ★10,480 · Updated this week
- Papers & presentation materials from Hugging Face's internal science day ★2,035 · Updated 4 years ago
- Train transformer language models with reinforcement learning. ★9,967 · Updated this week
- 🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools ★19,228 · Updated this week
- Development repository for the Triton language and compiler ★13,311 · Updated this week
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ★7,284 · Updated last week
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… ★12,054 · Updated this week
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ★30,426 · Updated this week
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ★13,928 · Updated 2 weeks ago
- Open source annotation tool for machine learning practitioners. ★9,549 · Updated last week
- Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conve… ★4,101 · Updated 5 months ago
- XLNet: Generalized Autoregressive Pretraining for Language Understanding ★6,181 · Updated last year
- Flax is a neural network library for JAX that is designed for flexibility. ★6,101 · Updated this week
- Fast and memory-efficient exact attention ★14,109 · Updated this week
- A library for efficient similarity search and clustering of dense vectors. ★31,320 · Updated this week
- BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ★6,919 · Updated last year
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★16,354 · Updated this week
- A natural language modeling framework based on PyTorch ★6,338 · Updated 2 years ago
- 🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools ★2,556 · Updated this week
- The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic … ★3,488 · Updated this week
- ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator ★14,642 · Updated this week