MiniXC / opensubtitles-dataloader
Loads the OpenSubtitles v2018 dataset without loading everything into memory at once. Works well with PyTorch.
☆13 Updated 4 years ago
Alternatives and similar repositories for opensubtitles-dataloader
Users who are interested in opensubtitles-dataloader are comparing it to the libraries listed below.
- Conversational text Analysis using various NLP techniques ☆180 Updated 2 years ago
- A slightly opinionated iPython profile for interactive development ☆23 Updated 3 years ago
- 🕊️ Radically lightweight command-line interfaces ☆107 Updated 2 years ago
- Test prompts for GPT-J-6B and the resulting AI-generated texts ☆53 Updated 4 years ago
- MILES is a multilingual text simplifier inspired by LSBert - A BERT-based lexical simplification approach proposed in 2018. Unlike LSBert… ☆49 Updated 4 years ago
- A utility for labeling clusters of text data. ☆28 Updated 3 years ago
- A corpus of Python programs annotated with contracts ☆22 Updated 2 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆52 Updated 4 years ago
- Question Generation - Question Answering for Automatic Flashcards ☆66 Updated 3 years ago
- Podium: a framework agnostic Python NLP library for data loading and preprocessing ☆60 Updated 2 years ago
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe… ☆68 Updated 2 years ago
- Confection: the sweetest config system for Python ☆187 Updated 3 months ago
- With a little help from deep learning, now you too can create your own happy accidents ☆65 Updated 2 years ago
- CLI tool for comparing images ☆35 Updated 2 years ago
- PyNLP Lib is an open source Python NLP library that provides functionality for both web and local development ☆50 Updated 2 years ago
- Weird A.I. Yankovic neural-net based lyrics parody generator ☆84 Updated 3 years ago
- ☆18 Updated 3 years ago
- Natural Language Inflection in English ☆11 Updated 3 years ago
- Markdown template for Datasheets for Datasets ☆63 Updated 3 years ago
- Abydos NLP/IR library for Python ☆187 Updated 2 years ago
- Python wrapper for Ferret ☆41 Updated 3 years ago
- A minimal Python kernel so you can run Python in your Python ☆39 Updated 3 years ago
- spaCy match and replace, maintaining conjugation ☆35 Updated 2 years ago
- ☆70 Updated 2 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts. ☆38 Updated 6 years ago
- NoPdb: Non-interactive Python Debugger ☆84 Updated 3 years ago
- A python module for word inflections designed for use with spaCy. ☆92 Updated 5 years ago
- Scansion tool for Spanish texts ☆12 Updated last year
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆61 Updated 2 years ago
- Efficiently computing & storing token n-grams from large corpora ☆24 Updated 9 months ago