domerin0 / opensubtitles-parser
Download, extract, parse, and tokenize the OpenSubtitles dataset with this script.
☆44 Updated 7 years ago
Alternatives and similar repositories for opensubtitles-parser:
Users who are interested in opensubtitles-parser are comparing it to the libraries listed below.
- A repository linking to publicly available dialog datasets. Feel free to send pull requests. ☆67 Updated 3 years ago
- NAACL 2019 paper: Density Matching for Bilingual Word Embedding (Zhou et al., 2019) ☆63 Updated 2 years ago
- Large scale sentential paraphrases collection and annotation ☆46 Updated 2 years ago
- Resources for the OpenNMT hackathon ☆51 Updated 5 years ago
- ☆27 Updated 6 years ago
- ☆88 Updated 8 years ago
- DSTC6: End-to-End Conversation Modeling Track ☆56 Updated 7 years ago
- Baseline models, training scripts, and instructions on how to reproduce our results for our state-of-art grammar correction system from M… ☆72 Updated 5 years ago
- Large corpus of uncompressed and compressed sentences from news articles. ☆123 Updated 7 years ago
- Lexically constrained decoding for sequence generation using Grid Beam Search ☆91 Updated 6 years ago
- Assessing syntactic abilities of BERT ☆148 Updated 5 years ago
- Dynamic evaluation for pytorch language models, now includes hyperparameter tuning ☆104 Updated 7 years ago
- ☆56 Updated 6 years ago
- This repository makes the integral Let's Go dataset publicly available. ☆45 Updated last year
- Parsing Reading Predict Network ☆96 Updated 6 years ago
- Tools for accessing Maluuba's Travel Dialogue Dataset ☆75 Updated 5 years ago
- ☆54 Updated 9 years ago
- An extremely simple Python wrapper for the SRI Language Modeling toolkit ☆70 Updated 10 years ago
- ☆47 Updated 7 years ago
- Reproduction instructions for "Rapid Adaptation of Neural Machine Translation to New Languages" ☆40 Updated 6 years ago
- Attention-based NMT with a coverage mechanism to indicate whether a source word is translated or not ☆111 Updated 4 years ago
- Datasets for Question Answering by Search and Reading ☆69 Updated 7 years ago
- ☆52 Updated 7 years ago
- AskUbuntu Question Dataset ☆69 Updated 8 years ago
- Code for "Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation" (NAACL 2018) ☆92 Updated 6 years ago
- Easy to use scripts for evaluating word vectors on a variety of tasks. ☆119 Updated 4 years ago
- Code for the collection and analysis of the MTNT dataset ☆55 Updated 6 years ago
- An updated version of the Parser-v1 repo, used for Stanford's submission in the CoNLL17 shared task. ☆47 Updated 6 years ago
- pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference ☆62 Updated 2 years ago
- Transition-based UCCA Parser ☆72 Updated 4 years ago