MiniXC / opensubtitles-dataloader
Loads the OpenSubtitles v2018 dataset without having to load everything into memory at once. Works well with PyTorch.
☆13 Updated 5 years ago
Alternatives and similar repositories for opensubtitles-dataloader
Users who are interested in opensubtitles-dataloader compare it to the libraries listed below.
- A python module for word inflections designed for use with spaCy. ☆93 Updated 5 years ago
- Conversational text Analysis using various NLP techniques ☆180 Updated 2 years ago
- Question Generation - Question Answering for Automatic Flashcards ☆66 Updated 3 years ago
- MILES is a multilingual text simplifier inspired by LSBert - A BERT-based lexical simplification approach proposed in 2018. Unlike LSBert… ☆49 Updated 4 years ago
- 🕊️ Radically lightweight command-line interfaces ☆105 Updated 2 years ago
- Experiments with generating GPT-2 fanfiction on specified topics. ☆11 Updated 6 years ago
- ☆70 Updated 2 years ago
- Lightning Fast Language Prediction 🚀 ☆167 Updated last week
- Test prompts for GPT-J-6B and the resulting AI-generated texts ☆53 Updated 4 years ago
- Abydos NLP/IR library for Python ☆188 Updated 2 years ago
- The world's largest social media toxicity dataset. ☆182 Updated 3 years ago
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German ☆12 Updated 2 years ago
- Lazy, a tool for running things in idle time ☆48 Updated 4 years ago
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆61 Updated 2 years ago
- Confection: the sweetest config system for Python ☆188 Updated 4 months ago
- Weird A.I. Yankovic neural-net based lyrics parody generator ☆84 Updated 3 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆53 Updated 4 years ago
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe… ☆68 Updated 2 years ago
- 🔤 Measure edit distance based on keyboard layout ☆61 Updated last year
- Grammar Induction using a Template Tree Approach ☆46 Updated 3 months ago
- Cortex-compatible model server for Python and TensorFlow ☆17 Updated 2 years ago
- Vectory provides a collection of tools to track and compare embedding versions. ☆71 Updated 2 years ago
- A utility for labeling clusters of text data. ☆28 Updated 4 years ago
- spaCy match and replace, maintaining conjugation ☆35 Updated 2 years ago
- Loadable spellfix1 extension for sqlite as python package ☆26 Updated last year
- Run compute jobs on AWS as if you were running them locally. ☆125 Updated 3 years ago
- A python package to simulate typographical errors. ☆37 Updated last year
- Parse natural language time expressions in python ☆131 Updated 2 years ago
- Next-generation Punkt sentence boundary detection with zero dependencies ☆17 Updated 3 weeks ago
- ⦠ Angle: new speakable syntax for python 💡 ☆132 Updated last year