jerry2yu / ngrams
A C++ package for character or word ngram analysis. It uses a ternary search tree instead of a hash table for faster ngram frequency counting. Words are converted to unique IDs and encoded as more compact base-256 integers. It is a partial implementation of Dr. Vlado Keselj's Text-Ngrams 1.6, a very flexible ngram package in Perl.
☆20 Updated 10 years ago
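The two ideas in the description, counting ngrams in a ternary search tree and packing word IDs into base-256 digits, can be sketched roughly as follows. This is an illustrative sketch only, not the package's actual code: the names `TernarySearchTree`, `addCharNgrams`, and `encodeBase256` are hypothetical, and the byte order used for the encoding (most significant digit first) is an assumption. It needs C++14 for `std::make_unique`.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical illustration; not the ngrams package's own classes.
// Each node holds one character, a count for keys ending at that node,
// and three children: lower, equal (next character), and higher.
class TernarySearchTree {
public:
    // Increment the count stored for `key` (empty keys are ignored).
    void add(const std::string& key) {
        if (key.empty()) return;
        std::unique_ptr<Node>* slot = &root_;
        std::size_t i = 0;
        while (true) {
            if (!*slot) *slot = std::make_unique<Node>(key[i]);
            Node* n = slot->get();
            if (key[i] < n->ch) slot = &n->lo;
            else if (key[i] > n->ch) slot = &n->hi;
            else if (i + 1 == key.size()) { ++n->count; return; }
            else { slot = &n->eq; ++i; }
        }
    }

    // Return how many times `key` was added (0 if it was never seen).
    std::size_t count(const std::string& key) const {
        if (key.empty()) return 0;
        const Node* n = root_.get();
        std::size_t i = 0;
        while (n) {
            if (key[i] < n->ch) n = n->lo.get();
            else if (key[i] > n->ch) n = n->hi.get();
            else if (i + 1 == key.size()) return n->count;
            else { n = n->eq.get(); ++i; }
        }
        return 0;
    }

private:
    struct Node {
        explicit Node(char c) : ch(c) {}
        char ch;
        std::size_t count = 0;
        std::unique_ptr<Node> lo, eq, hi;
    };
    std::unique_ptr<Node> root_;
};

// Add every character ngram of length n found in `text` to the tree.
void addCharNgrams(TernarySearchTree& tree, const std::string& text, std::size_t n) {
    if (n == 0 || text.size() < n) return;
    for (std::size_t i = 0; i + n <= text.size(); ++i) {
        tree.add(text.substr(i, n));
    }
}

// Encode a word ID as base-256 digits, one byte per digit,
// most significant digit first (the byte order is an assumption).
std::string encodeBase256(std::uint32_t id) {
    std::string out;
    do {
        out.insert(out.begin(), static_cast<char>(id & 0xFF));
        id >>= 8;
    } while (id != 0);
    return out;
}
```

For example, calling `addCharNgrams(tree, "abab", 2)` stores the bigrams "ab" (twice) and "ba" (once), and `encodeBase256(258)` yields the two bytes 1 and 2. How the real package combines encoded IDs into word ngram keys is not stated in the description, so that step is left out here.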
Alternatives and similar repositories for ngrams
Users who are interested in ngrams are comparing it to the libraries listed below.
- Extractors whose input is a chunked sentence. Includes Relnoun, Nesty, and a Scala interface for ReVerb. ☆28 Updated 7 years ago
- Entity Linking in Queries: Efficiency vs. Effectiveness ☆18 Updated 7 years ago
- ☆21 Updated 8 years ago
- This repository contains source code to binarize any real-value word embeddings into binary vectors. ☆47 Updated 4 years ago
- Entity Linking in Queries: Tasks and Evaluation ☆33 Updated 2 years ago
- Semantic embeddings of entities ☆66 Updated 9 years ago
- A tool for classifying mistakes in the output of parsers ☆41 Updated 2 years ago
- C++ implementation of the Hellinger PCA for computing word embeddings. ☆32 Updated 8 years ago
- Zero-Shot Open Entity Typing as Type-Compatible Grounding, EMNLP'18. ☆42 Updated 5 years ago
- CS224S Course Project ☆14 Updated 11 years ago
- Morfessor FlatCat ☆13 Updated 6 years ago
- Simple Structured Perceptron tagger in Python ☆10 Updated 8 years ago
- Dynamic Entity Summarization (DynES) ☆20 Updated 6 years ago
- Extractive and Compressive Neural Summarization Based on Summary State Representations (NAACL 2019) ☆16 Updated 5 years ago
- RhetoricalRecursiveNeuralNetwork (R2N2) is a recursive neural network using RST for NLP tasks such as sentiment analysis ☆12 Updated 10 years ago
- ☆20 Updated 6 years ago
- cicada: a hypergraph-based toolkit for statistical machine translation based on {tree, string}-to-{tree, string} models ☆42 Updated 4 years ago
- word2vec++ is a Distributed Representations of Words (word2vec) library and tools implementation, written in C++11 from scratch ☆141 Updated last year
- Standalone Neural Ranking Model (SNRM) ☆76 Updated 6 years ago
- Tools for working with the TREC CAR dataset. ☆35 Updated 2 months ago
- Neural Reranking for Named Entity Recognition, accepted as regular paper at RANLP 2017 ☆23 Updated 8 years ago
- WordRank: Learning Word Embeddings via Robust Ranking ☆51 Updated 7 years ago
- Reproducibility of the TAGME entity linking system ☆60 Updated 6 years ago
- ☆47 Updated 8 years ago
- The implementation of 'Effective Document Labeling with Very Few Seed Words: A Topic Modeling Approach', Chenliang Li, Jian Xing, Aixin S… ☆17 Updated 2 years ago
- Modularizing Unsupervised Sense Embedding ☆29 Updated 7 years ago
- 📄 Neural Sentential Paraphrase Generation to Augment Chatbot Training Dataset ☆21 Updated 2 years ago
- Implementation of the algorithm described in "Multi-sentence compression: Finding shortest paths in word graphs" by Katja Filippova. ☆12 Updated 10 years ago
- ☆14 Updated 8 years ago
- Word embedding approach based on a dynamic log-linear model ☆54 Updated 8 years ago