google / sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
☆10,505 Updated last month
Alternatives and similar repositories for sentencepiece:
Users who are interested in sentencepiece are comparing it to the libraries listed below.
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ☆6,241 Updated 4 months ago
- State-of-the-Art Text Embeddings ☆15,845 Updated this week
- BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ☆7,099 Updated last year
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production ☆9,278 Updated this week
- An open-source NLP research library, built on PyTorch. ☆11,788 Updated 2 years ago
- Ongoing research training transformer models at scale ☆11,192 Updated this week
- XLNet: Generalized Autoregressive Pretraining for Language Understanding ☆6,183 Updated last year
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆30,852 Updated 2 weeks ago
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ☆14,034 Updated this week
- 🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP ☆12,543 Updated last year
- A library for efficient similarity search and clustering of dense vectors. ☆32,556 Updated this week
- Data augmentation for NLP ☆4,493 Updated 7 months ago
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto… ☆12,961 Updated this week
- Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research. ☆15,758 Updated last year
- Open Source Neural Machine Translation and (Large) Language Models in PyTorch ☆6,806 Updated 2 weeks ago
- A framework for training and evaluating AI models on a variety of openly available dialogue datasets. ☆10,498 Updated last year
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ☆14,344 Updated last month
- TensorFlow code and pre-trained models for BERT ☆38,562 Updated 6 months ago
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆8,208 Updated this week
- Trax — Deep Learning with Clear Code and Speed ☆8,149 Updated last week
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ☆7,349 Updated this week
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ☆2,898 Updated last year
- Topic Modelling for Humans ☆15,804 Updated last month
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆17,069 Updated this week
- Unsupervised Word Segmentation for Neural Machine Translation and Text Generation ☆2,215 Updated 5 months ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ☆2,345 Updated 10 months ago
- Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes. ☆28,846 Updated this week
- A natural language modeling framework based on PyTorch ☆6,332 Updated 2 years ago
- Code for the paper "Language Models are Unsupervised Multitask Learners" ☆22,847 Updated 5 months ago
- An annotated implementation of the Transformer paper. ☆5,935 Updated 9 months ago