Tools for extracting parallel corpora from article titles across languages in Wikipedia
☆74 · Feb 25, 2015 · Updated 11 years ago
Alternatives and similar repositories for wikipedia-parallel-titles
Users that are interested in wikipedia-parallel-titles are comparing it to the libraries listed below.
- Lab exercises for the DL4MT winter school at DCU ☆15 · Oct 21, 2015 · Updated 10 years ago
- Cynical data selection ☆20 · Jan 16, 2021 · Updated 5 years ago
- Neural machine translation implementation using dynet's python bindings ☆17 · Jan 24, 2018 · Updated 8 years ago
- Decoding platform for machine translation research ☆54 · Aug 24, 2019 · Updated 6 years ago
- ☆18 · Oct 5, 2017 · Updated 8 years ago
- A repo for sharing language resources related to the outbreak (in machine readable format) ☆25 · Sep 22, 2025 · Updated 6 months ago
- C++ implementation of Generalised Brown clustering and python scripts for feature generation ☆41 · Apr 8, 2016 · Updated 10 years ago
- Fast structured perceptron sequential labeler ☆15 · Dec 8, 2015 · Updated 10 years ago
- Efficient Markov Chain word alignment ☆53 · Aug 1, 2021 · Updated 4 years ago
- Dynamic data selection for neural machine translation ☆20 · Jan 28, 2018 · Updated 8 years ago
- BiCVM Code ☆45 · May 14, 2018 · Updated 7 years ago
- Will store links to known evaluation datasets alongside stats to characterize them ☆24 · Mar 9, 2016 · Updated 10 years ago
- ☆44 · Nov 30, 2017 · Updated 8 years ago
- Graph-based Dependency Parser ☆46 · Jan 25, 2016 · Updated 10 years ago
- Witwicky: An implementation of Transformer in PyTorch. ☆22 · Aug 17, 2020 · Updated 5 years ago
- ☆12 · Dec 9, 2015 · Updated 10 years ago
- Gale&Church (1993) sentence alignment ☆16 · May 9, 2020 · Updated 5 years ago
- This repository contains additional reference translations for the WMT'14 En-De (newstest2014) and WMT'19 En-Ru (newstest2019) test sets … ☆15 · Aug 31, 2021 · Updated 4 years ago
- Appraise evaluation system for manual evaluation of machine translation output ☆77 · May 7, 2021 · Updated 4 years ago
- Latent-variable Synchronous Context-Free Grammar Toolkit ☆10 · Sep 30, 2014 · Updated 11 years ago
- A list of Neural MT implementations ☆364 · Jul 27, 2022 · Updated 3 years ago
- C++/CUDA toolkit for training sequence and sequence-to-sequence models across multiple GPUs ☆185 · May 15, 2017 · Updated 8 years ago
- The Berkeley Entity Resolution System jointly solves the problems of named entity recognition, coreference resolution, and entity linking… ☆188 · Dec 7, 2019 · Updated 6 years ago
- ☆21 · Dec 9, 2016 · Updated 9 years ago
- Simple CORPORA list crawler ☆10 · Dec 2, 2016 · Updated 9 years ago
- Fast Word Clustering Software ☆79 · Feb 8, 2025 · Updated last year
- Lazy python recipes. ☆10 · Updated this week
- A High-Quality Multilingual Dataset for Structured Documentation Translation ☆37 · May 1, 2025 · Updated 11 months ago
- ☆34 · Nov 29, 2016 · Updated 9 years ago
- ☆20 · Aug 17, 2021 · Updated 4 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 · Jun 18, 2024 · Updated last year
- A tool for extracting plain text from Wikipedia dumps ☆15 · Sep 13, 2018 · Updated 7 years ago
- Open-Source Neural Machine Translation in Tensorflow ☆803 · Dec 9, 2022 · Updated 3 years ago
- scripts and configuration files for Edinburgh neural MT submission to WMT 16 shared translation task ☆138 · Nov 5, 2020 · Updated 5 years ago
- Easy Bootstrap Resampling and Approximate Randomization for BLEU, METEOR, and TER using Multiple Optimizer Runs. This implements "Better … ☆205 · Feb 25, 2023 · Updated 3 years ago
- Top-Down BTG-based Preordering ☆16 · Jan 14, 2016 · Updated 10 years ago
- A rule-based machine translation system from Ottoman Turkish to Modern Turkish. ☆23 · Jul 8, 2020 · Updated 5 years ago
- Multilingual image description ☆45 · Feb 9, 2018 · Updated 8 years ago
- ☆121 · Mar 15, 2017 · Updated 9 years ago