clab / wikipedia-parallel-titles
Tools for extracting parallel corpora from article titles across languages in Wikipedia
☆74 · Feb 25, 2015 · Updated 10 years ago
Alternatives and similar repositories for wikipedia-parallel-titles
Users who are interested in wikipedia-parallel-titles are comparing it to the libraries listed below.
- Lab exercises for the DL4MT winter school at DCU ☆15 · Oct 21, 2015 · Updated 10 years ago
- Will store links to known evaluation datasets alongside stats to characterize them ☆24 · Mar 9, 2016 · Updated 9 years ago
- ☆44 · Nov 30, 2017 · Updated 8 years ago
- Fast structured perceptron sequential labeler ☆15 · Dec 8, 2015 · Updated 10 years ago
- Neural machine translation implementation using dynet's python bindings ☆17 · Jan 24, 2018 · Updated 8 years ago
- Efficient Markov Chain word alignment ☆52 · Aug 1, 2021 · Updated 4 years ago
- Decoding platform for machine translation research ☆54 · Aug 24, 2019 · Updated 6 years ago
- ☆12 · Dec 9, 2015 · Updated 10 years ago
- A repo for sharing language resources related to the outbreak (in machine readable format) ☆25 · Sep 22, 2025 · Updated 4 months ago
- A little text processing library for Scala. ☆28 · Mar 3, 2016 · Updated 9 years ago
- C++ implementation of Generalised Brown clustering and python scripts for feature generation ☆41 · Apr 8, 2016 · Updated 9 years ago
- Cynical data selection ☆20 · Jan 16, 2021 · Updated 5 years ago
- Fast Word Clustering Software ☆79 · Feb 8, 2025 · Updated last year
- ☆18 · Oct 5, 2017 · Updated 8 years ago
- A High-Quality Multilingual Dataset for Structured Documentation Translation ☆37 · May 1, 2025 · Updated 9 months ago
- ☆20 · Aug 17, 2021 · Updated 4 years ago
- Train bilingual embeddings as described in our NAACL 2015 workshop paper "Bilingual Word Representations with Monolingual Quality in Mind… ☆78 · Jun 15, 2019 · Updated 6 years ago
- Witwicky: An implementation of Transformer in PyTorch. ☆22 · Aug 17, 2020 · Updated 5 years ago
- Simple CORPORA list crawler ☆10 · Dec 2, 2016 · Updated 9 years ago
- Gale&Church (1993) sentence alignment ☆16 · May 9, 2020 · Updated 5 years ago
- ☆21 · Dec 9, 2016 · Updated 9 years ago
- Named Entity Disambiguation for Noisy Text ☆66 · Jun 26, 2017 · Updated 8 years ago
- Graph-based Dependency Parser ☆46 · Jan 25, 2016 · Updated 10 years ago
- C++/CUDA toolkit for training sequence and sequence-to-sequence models across multiple GPUs ☆187 · May 15, 2017 · Updated 8 years ago
- Lazy python recipes. ☆10 · Apr 17, 2021 · Updated 4 years ago
- Data collection, alignment and TAUS repository ☆23 · Nov 30, 2017 · Updated 8 years ago
- Codenize your datasources. ☆27 · Dec 1, 2024 · Updated last year
- BiCVM Code ☆45 · May 14, 2018 · Updated 7 years ago
- The Berkeley Entity Resolution System jointly solves the problems of named entity recognition, coreference resolution, and entity linking… ☆186 · Dec 7, 2019 · Updated 6 years ago
- cicada: a hypergraph-based toolkit for statistical machine translation based on {tree, string}-to-{tree, string} models ☆42 · Aug 9, 2021 · Updated 4 years ago
- Multilingual image description ☆45 · Feb 9, 2018 · Updated 8 years ago
- Official code and data of "3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset" ☆12 · Dec 8, 2024 · Updated last year
- Latent-variable Synchronous Context-Free Grammar Toolkit ☆10 · Sep 30, 2014 · Updated 11 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ☆11 · Feb 6, 2024 · Updated 2 years ago
- Example project showing how you can use your fast.ai based scripts to let Amazon SageMaker perform the training and hosting of your model… ☆14 · Feb 20, 2019 · Updated 6 years ago
- Efficient Low-Memory Aligner ☆146 · Jan 15, 2025 · Updated last year
- A BiRNN framework implemented in Python and TensorFlow to extract parallel sentences from aligned comparable corpora. ☆33 · Sep 4, 2018 · Updated 7 years ago
- A list of Neural MT implementations ☆365 · Jul 27, 2022 · Updated 3 years ago
- ☆13 · Aug 20, 2021 · Updated 4 years ago