google / sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
★10,878 · Updated last month
Alternatives and similar repositories for sentencepiece
Users who are interested in sentencepiece are comparing it to the libraries listed below.
- BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ★7,389 · Updated last year
- Fast State-of-the-Art Tokenizers optimized for Research and Production ★9,670 · Updated last month
- Open Source Neural Machine Translation and (Large) Language Models in PyTorch ★6,871 · Updated 2 months ago
- XLNet: Generalized Autoregressive Pretraining for Language Understanding ★6,188 · Updated last year
- An open-source NLP research library, built on PyTorch. ★11,846 · Updated 2 years ago
- State-of-the-Art Text Embeddings ★16,638 · Updated last week
- Scalable embedding, reasoning, ranking for images and sentences with CLIP ★12,665 · Updated last year
- Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research. ★16,118 · Updated last year
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ★6,349 · Updated 2 weeks ago
- Ongoing research training transformer models at scale ★12,310 · Updated this week
- TensorFlow code and pre-trained models for BERT ★39,118 · Updated 9 months ago
- Topic Modelling for Humans ★16,013 · Updated 3 months ago
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ★7,461 · Updated 2 weeks ago
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ★2,906 · Updated 2 years ago
- Google AI 2018 BERT pytorch implementation ★6,391 · Updated last year
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ★14,161 · Updated last week
- A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★8,708 · Updated this week
- Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings ★7,015 · Updated 5 months ago
- A library for efficient similarity search and clustering of dense vectors. ★34,852 · Updated this week
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations ★3,270 · Updated 2 years ago
- Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. ★14,472 · Updated 3 weeks ago
- A natural language modeling framework based on PyTorch ★6,327 · Updated 2 years ago
- Library for fast text representation and classification. ★26,203 · Updated last year
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★31,435 · Updated 4 months ago
- TensorFlow Neural Machine Translation Tutorial ★6,423 · Updated 2 years ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ★2,353 · Updated last year
- Fast and memory-efficient exact attention ★17,346 · Updated last week
- An annotated implementation of the Transformer paper. ★6,220 · Updated last year
- Accessible large language models via k-bit quantization for PyTorch. ★7,020 · Updated this week
- Train transformer language models with reinforcement learning. ★13,703 · Updated this week