google / sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
★11,345 · Updated 2 weeks ago
Alternatives and similar repositories for sentencepiece
Users that are interested in sentencepiece are comparing it to the libraries listed below
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" ★6,438 · Updated 5 months ago
- 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production ★10,147 · Updated this week
- Open Source Neural Machine Translation and (Large) Language Models in PyTorch ★6,954 · Updated this week
- An open-source NLP research library, built on PyTorch. ★11,880 · Updated 2 years ago
- XLNet: Generalized Autoregressive Pretraining for Language Understanding ★6,180 · Updated 2 years ago
- BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.) ★7,710 · Updated 4 months ago
- State-of-the-Art Text Embeddings ★17,685 · Updated this week
- Models, data loaders and abstractions for language processing, powered by PyTorch ★3,557 · Updated last month
- Ongoing research training transformer models at scale ★13,824 · Updated last week
- The implementation of DeBERTa ★2,158 · Updated 2 years ago
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,211 · Updated this week
- Google AI 2018 BERT PyTorch implementation ★6,480 · Updated 2 years ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ★31,861 · Updated 2 weeks ago
- Language-Agnostic SEntence Representations ★3,648 · Updated last year
- Data augmentation for NLP ★4,620 · Updated last year
- Unsupervised Word Segmentation for Neural Machine Translation and Text Generation ★2,251 · Updated last year
- A library for efficient similarity search and clustering of dense vectors. ★37,538 · Updated this week
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ★7,628 · Updated this week
- Code and model for the paper "Improving Language Understanding by Generative Pre-Training" ★2,245 · Updated 6 years ago
- A very simple framework for state-of-the-art Natural Language Processing (NLP) ★14,300 · Updated 2 months ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ★2,363 · Updated last year
- An annotated implementation of the Transformer paper. ★6,612 · Updated last year
- A natural language modeling framework based on PyTorch ★6,319 · Updated 3 years ago
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ★2,922 · Updated 2 years ago
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations ★3,273 · Updated 2 years ago
- Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings ★7,114 · Updated 2 months ago
- TensorFlow Neural Machine Translation Tutorial ★6,441 · Updated 3 years ago
- Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conve… ★4,217 · Updated last month
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★19,832 · Updated this week
- ★2,895 · Updated this week