MiniXC / opensubtitles-dataloader
Loads the OpenSubtitles v2018 dataset without having to load everything into memory at once. Works well with PyTorch.
☆13 · Updated 5 years ago
Alternatives and similar repositories for opensubtitles-dataloader
Users who are interested in opensubtitles-dataloader are comparing it to the libraries listed below.
- Test prompts for GPT-J-6B and the resulting AI-generated texts ☆53 · Updated 4 years ago
- Measure edit distance based on keyboard layout ☆64 · Updated 4 months ago
- A Python module for word inflections designed for use with spaCy. ☆93 · Updated 6 years ago
- spaCy match and replace, maintaining conjugation ☆36 · Updated 3 years ago
- A slightly opinionated IPython profile for interactive development ☆23 · Updated 3 years ago
- Visual Automata is a Python 3 library built as a wrapper for the Automata library to add more visualization features. ☆57 · Updated 2 years ago
- Radically lightweight command-line interfaces ☆108 · Updated 5 months ago
- Conversational text analysis using various NLP techniques ☆182 · Updated 2 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆58 · Updated 4 years ago
- NoPdb: Non-interactive Python Debugger ☆84 · Updated 3 years ago
- A utility for labeling clusters of text data. ☆28 · Updated 4 years ago
- Abydos NLP/IR library for Python ☆194 · Updated 3 years ago
- Vectory provides a collection of tools to track and compare embedding versions. ☆71 · Updated 3 years ago
- ☆70 · Updated 3 years ago
- ☆18 · Updated 3 years ago
- Efficiently computing & storing token n-grams from large corpora ☆26 · Updated last year
- Flenser is a simple, minimal, automated exploratory data analysis tool. ☆78 · Updated 9 months ago
- A corpus of Python programs annotated with contracts ☆25 · Updated 3 months ago
- Lightning Fast Language Prediction ☆167 · Updated 5 months ago
- Parse numbers written in natural language ☆126 · Updated last year
- Run compute jobs on AWS as if you were running them locally. ☆124 · Updated 4 years ago
- Python wrapper for Ferret ☆45 · Updated 4 years ago
- Confection: the sweetest config system for Python ☆193 · Updated this week
- Finds linguistic patterns effortlessly ☆39 · Updated 2 years ago
- Extremely easy-to-use sequence-to-sequence library with attention, for text-to-text conversion tasks. ☆39 · Updated 5 years ago
- fasttext with wheels and no external dependency, but only the predict method (<1MB) ☆19 · Updated last year
- A relational algebra shell ☆20 · Updated 3 years ago
- MILES is a multilingual text simplifier inspired by LSBert, a BERT-based lexical simplification approach proposed in 2018. Unlike LSBert… ☆50 · Updated 4 years ago
- Composable, first-order file-to-file transformations in Python ☆33 · Updated 4 years ago
- Language detection using spaCy and fastText ☆57 · Updated 2 years ago