jerry2yu / ngrams
A C++ package for character or word n-gram analysis. It uses a ternary search tree instead of a hash table for faster n-gram frequency counting. Words are converted to unique IDs and encoded as more compact base-256 integers. It is a partial implementation of Dr. Vlado Keselj's Text-Ngrams 1.6, a very flexible n-gram package in Perl.
☆20Updated 10 years ago
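As a rough illustration of the approach described above, here is a minimal sketch of n-gram counting with a ternary search tree, assuming C++14. The names `TernaryCounter` and `encodeId` are hypothetical and are not taken from the package. The fixed 4-byte width for word IDs is also an assumption, chosen so that concatenated IDs stay unambiguous; the package's actual encoding may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical ternary search tree that maps byte strings to counts.
// Illustrative only; this is not the package's own class.
class TernaryCounter {
    struct Node {
        unsigned char ch;               // byte stored at this node
        std::size_t count = 0;          // times a key ending here was added
        std::unique_ptr<Node> lo, eq, hi;
        explicit Node(unsigned char c) : ch(c) {}
    };
    std::unique_ptr<Node> root_;

public:
    // Increment the count for key and return the new count.
    // Empty keys are ignored and report 0.
    std::size_t add(const std::string& key) {
        if (key.empty()) return 0;
        std::unique_ptr<Node>* slot = &root_;
        std::size_t i = 0;
        for (;;) {
            unsigned char c = static_cast<unsigned char>(key[i]);
            if (!*slot) *slot = std::make_unique<Node>(c);
            Node* n = slot->get();
            if (c < n->ch) slot = &n->lo;                 // go left
            else if (c > n->ch) slot = &n->hi;            // go right
            else if (i + 1 == key.size()) return ++n->count;
            else { slot = &n->eq; ++i; }                  // next byte
        }
    }

    // Return how many times key was added (0 if never).
    std::size_t count(const std::string& key) const {
        if (key.empty()) return 0;
        const Node* n = root_.get();
        std::size_t i = 0;
        while (n) {
            unsigned char c = static_cast<unsigned char>(key[i]);
            if (c < n->ch) n = n->lo.get();
            else if (c > n->ch) n = n->hi.get();
            else if (i + 1 == key.size()) return n->count;
            else { n = n->eq.get(); ++i; }
        }
        return 0;
    }
};

// Encode a word ID as four base-256 digits, most significant first.
// The fixed width is an assumption made for this sketch.
std::string encodeId(std::uint32_t id) {
    std::string out(4, '\0');
    for (int i = 3; i >= 0; --i) {
        out[i] = static_cast<char>(id & 0xFF);
        id >>= 8;
    }
    return out;
}
```

A character n-gram can be passed to `add` directly, for example `counter.add("ab")`. A word bigram would be the concatenation of two encoded IDs, for example `counter.add(encodeId(1) + encodeId(2))`. Keys are compared byte by byte, so the zero bytes in encoded IDs are handled like any other byte.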
Alternatives and similar repositories for ngrams
Users that are interested in ngrams are comparing it to the libraries listed below
- Tree-Structured, First- and Higher-Order Linear Chain, and Semi-Markov CRFs☆45Updated 6 years ago
- Dynamic Entity Summarization (DynES)☆20Updated 6 years ago
- WordRank: Learning Word Embeddings via Robust Ranking☆51Updated 7 years ago
- C++ implementation of the Hellinger PCA for computing word embeddings.☆32Updated 9 years ago
- ☆21Updated 8 years ago
- Entity Linking in Queries: Tasks and Evaluation☆33Updated 2 years ago
- Context Encoders (ConEc) as a simple but powerful extension of the word2vec model for learning word embeddings☆20Updated 5 years ago
- Resources for the Tutorial on "Utilizing Knowledge Bases in Text-centric Information Retrieval"☆25Updated 9 years ago
- Utilities for manipulating finite state transducers with the OpenFst library.☆32Updated 8 years ago
- Extractive and Compressive Neural Summarization Based on Summary State Representations (NAACL 2019)☆16Updated 5 years ago
- Extractors whose input is a chunked sentence. Includes Relnoun, Nesty, and a Scala interface for ReVerb.☆28Updated 8 years ago
- My most frequently used learning-to-rank algorithms ported to Rust for efficiency. Try it: "pip install fastrank".☆52Updated 8 months ago
- Word embedding approach based on a dynamic log-linear model☆55Updated 8 years ago
- utility class for building/evaluating document representations☆53Updated 5 years ago
- CS224S Course Project☆14Updated 11 years ago
- Morfessor FlatCat☆13Updated 6 years ago
- Zero-Shot Open Entity Typing as Type-Compatible Grounding, EMNLP'18.☆42Updated 5 years ago
- ☆25Updated 2 years ago
- Named Entity Recognition (NER) models (neural and sparse) implemented based on package LibN3L☆19Updated 8 years ago
- Improving the effectiveness of Lucene's BM25 (and testing it using Yahoo! Answers and Stack Overflow collections)☆16Updated 3 years ago
- Entity Linking in Queries: Efficiency vs. Effectiveness☆18Updated 8 years ago
- Hierarchical word clustering, following "Brown clustering" (Brown et al., 1992)☆70Updated 10 years ago
- Standalone Neural Ranking Model (SNRM)☆76Updated 6 years ago
- Hacky implementation of ppjoin by Chuan Xiao et al.☆19Updated 11 years ago
- Experiment with document similarity via Matt Kusner's WMD paper☆24Updated 9 years ago
- Semantic embeddings of entities☆66Updated 9 years ago
- An entity linking prototype, developed using the datasets from the TAC-KBP sub-task.☆28Updated 8 years ago
- Lightweight C++ translator for OpenNMT Torch models (deprecated)☆81Updated 5 years ago
- Tools relating to the CC-News-En Collection☆20Updated last year
- The dataset and statistical analysis code released with the submission of EMNLP 2017 paper "Why We Need New Evaluation Metrics for NLG"☆19Updated 4 years ago