jerry2yu / ngrams
A C++ package for character or word n-gram analysis. It uses a ternary search tree instead of a hash table for faster n-gram frequency counting. Words are converted to unique IDs and encoded as more compact base-256 integers. It is a partial implementation of Dr. Vlado Keselj's Text-Ngrams 1.6, a very flexible n-gram package written in Perl.
☆20 · Updated 10 years ago
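The description above mentions counting n-gram frequencies in a ternary search tree and encoding word IDs as base-256 integers. As a rough illustration of that idea only, here is a minimal sketch; the names `NgramTst`, `addCharNgrams`, and `encodeId` are hypothetical and are not taken from the repository, and the length-prefix byte in `encodeId` is an assumption added here so that concatenated IDs stay unambiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical sketch of a ternary search tree that maps byte strings to
// counts. Not the repository's actual class.
class NgramTst {
public:
    // Increment the count stored for `key` (must be non-empty).
    void add(const std::string& key) {
        assert(!key.empty());
        std::unique_ptr<Node>* link = &root_;
        std::size_t i = 0;
        while (true) {
            const unsigned char c = static_cast<unsigned char>(key[i]);
            if (!*link) link->reset(new Node(c));   // grow the tree on demand
            Node* node = link->get();
            if (c < node->ch)            link = &node->lo;
            else if (c > node->ch)       link = &node->hi;
            else if (i + 1 < key.size()) { link = &node->eq; ++i; }
            else                         { ++node->count; return; }
        }
    }

    // Return how many times `key` was added (0 if never seen).
    std::size_t count(const std::string& key) const {
        if (key.empty()) return 0;
        const Node* node = root_.get();
        std::size_t i = 0;
        while (node) {
            const unsigned char c = static_cast<unsigned char>(key[i]);
            if (c < node->ch)            node = node->lo.get();
            else if (c > node->ch)       node = node->hi.get();
            else if (i + 1 < key.size()) { node = node->eq.get(); ++i; }
            else                         return node->count;
        }
        return 0;
    }

private:
    struct Node {
        explicit Node(unsigned char c) : ch(c) {}
        unsigned char ch;                  // byte stored at this node
        std::size_t count = 0;             // occurrences of the key ending here
        std::unique_ptr<Node> lo, eq, hi;  // smaller / next byte / larger
    };
    std::unique_ptr<Node> root_;
};

// Add every character n-gram of length `n` in `text` to the tree.
void addCharNgrams(NgramTst& tst, const std::string& text, std::size_t n) {
    if (n == 0 || text.size() < n) return;
    for (std::size_t i = 0; i + n <= text.size(); ++i) {
        tst.add(text.substr(i, n));
    }
}

// Encode a word ID as big-endian base-256 digits. The leading byte holds the
// digit count (an assumption of this sketch) so that several encoded IDs can
// be concatenated into one word n-gram key without ambiguity.
std::string encodeId(std::uint32_t id) {
    std::string digits;
    do {
        digits.insert(digits.begin(), static_cast<char>(id & 0xFF));
        id >>= 8;
    } while (id != 0);
    return std::string(1, static_cast<char>(digits.size())) + digits;
}
```

Under these assumptions, a word bigram key would be built as `encodeId(a) + encodeId(b)` and passed to `add`, while character n-grams are fed directly through `addCharNgrams`. A ternary search tree shares prefixes between keys and needs no hash computation, which is one plausible reason for the speed claim in the description.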
Alternatives and similar repositories for ngrams
Users who are interested in ngrams are comparing it to the libraries listed below.
- An entity linking prototype, developed using the datasets from the TAC-KBP sub-task. ☆28 · Updated 8 years ago
- Utilities for manipulating finite state transducers with the OpenFst library. ☆32 · Updated 8 years ago
- ☆21 · Updated 9 years ago
- CS224S Course Project ☆14 · Updated 11 years ago
- Extractive and Compressive Neural Summarization Based on Summary State Representations (NAACL 2019) ☆16 · Updated 5 years ago
- Extractors whose input is a chunked sentence. Includes Relnoun, Nesty, and a scala interface for ReVerb. ☆28 · Updated 8 years ago
- Dynamic Entity Summarization (DynES) ☆20 · Updated 6 years ago
- SMOR (Stuttgart Morphology) with alternative lemmatization component ☆13 · Updated 2 years ago
- Morfessor FlatCat ☆13 · Updated 6 years ago
- Deep learning model of machine translation using attentional and structural biases ☆13 · Updated 8 years ago
- A simple Python wrapper for the ClearNLP constituents-to-dependencies converter ☆10 · Updated 10 years ago
- ☆19 · Updated 7 years ago
- C++ implementation of a part-of-speech (POS) tagger using the lookahead tagging algorithm. ☆12 · Updated 6 years ago
- Simple Structured Perceptron tagger in Python ☆10 · Updated 8 years ago
- A tool for classifying mistakes in the output of parsers ☆41 · Updated 2 years ago
- Easy-first dependency parser based on Hierarchical Tree LSTMs ☆32 · Updated 9 years ago
- Twpipe is a pipeline toolkit that parses raw tweets into universal dependencies. ☆28 · Updated 6 years ago
- Corpus preprocessing ☆99 · Updated last year
- Keras implementation of ontology aware token embeddings ☆49 · Updated 7 years ago
- Open-source tools for morphological tagging, segmentation and stemming. ☆40 · Updated 6 years ago
- ☆25 · Updated 2 years ago
- ☆47 · Updated 8 years ago
- WordRank: Learning Word Embeddings via Robust Ranking ☆51 · Updated 7 years ago
- Entity Linking in Queries: Efficiency vs. Effectiveness ☆18 · Updated 8 years ago
- cicada: a hypergraph-based toolkit for statistical machine translation based on {tree, string}-to-{tree, string} models ☆42 · Updated 4 years ago
- ☆10 · Updated 7 years ago
- Neural Reranking for Named Entity Recognition, accepted as regular paper at RANLP 2017 ☆23 · Updated 8 years ago
- A multilingual, multi-style and multi-granularity dataset for cross-language textual similarity detection ☆61 · Updated 8 years ago
- Standalone Neural Ranking Model (SNRM) ☆76 · Updated 7 years ago
- Implicit relation extractor using a natural language model. ☆24 · Updated 7 years ago