dcjones / subsample
Randomly sample lines from massive text files efficiently
☆17Updated 10 years ago
Alternatives and similar repositories for subsample
Users that are interested in subsample are comparing it to the libraries listed below
- ☆23Updated 7 years ago
- A re-implementation of redpony/cdec's tokenize-anything.pl script in python☆8Updated 9 years ago
- Training scripts and recipes for Sockeye Neural Machine Translation toolkit☆37Updated 5 years ago
- C++ implementation of Generalised Brown clustering and python scripts for feature generation☆41Updated 9 years ago
- Pacaya - A Library for Hybrid Graphical Models and Neural Networks☆44Updated 7 years ago
- Fast Word Clustering Software☆78Updated 4 months ago
- Appraise evaluation system for manual evaluation of machine translation output☆75Updated 4 years ago
- ☆17Updated 4 years ago
- Unsupervised parsing and noun phrase identification☆22Updated 11 years ago
- Collection of Evaluation Metrics and Algorithms for Machine Translation☆76Updated 7 years ago
- Yara K-Beam Arc-Eager Dependency Parser☆56Updated 9 years ago
- ☆21Updated 10 years ago
- A simple CoNLL-X to tikz-dependency converter.☆20Updated 12 years ago
- ☆18Updated 7 years ago
- ☆56Updated 6 years ago
- Open-source implementation of the BilBOWA (Bilingual Bag-of-Words without Alignments) word embedding model.☆69Updated 3 years ago
- Neural machine translation soft alignment visualisations for web and command line☆72Updated 3 years ago
- An updated version of the Parser-v1 repo, used for Stanford's submission in the CoNLL17 shared task.☆47Updated 6 years ago
- Decoding platform for machine translation research☆55Updated 5 years ago
- Code accompanying our EMNLP paper Learning Language Representations for Typology Prediction☆71Updated 7 years ago
- Efficient Markov Chain word alignment☆53Updated 3 years ago
- Word sense disambiguation test sets for NMT☆19Updated 4 years ago
- Tool for comparison and evaluation of machine translation.☆56Updated 3 years ago
- Tools for extracting parallel corpora from article titles across languages in Wikipedia☆73Updated 10 years ago
- modlm: A toolkit for mixture of distributions language models☆27Updated 7 years ago
- Graph-based Dependency Parser☆46Updated 9 years ago
- Doing things with embeddings☆64Updated 2 years ago
- Format conversion and graphical representation of [Universal Dependencies](http://universaldependencies.org) trees.☆12Updated 9 months ago
- Code for EMNLP 2016 paper: Morphological Priors for Probabilistic Word Embeddings☆53Updated 8 years ago
- ☆22Updated 5 years ago