A C++ package for character and word ngram analysis. It uses a ternary search tree instead of a hash table for faster ngram frequency counting. Words are converted to unique IDs, which are then encoded as more compact base-256 integers. It is a partial implementation of Dr. Vlado Keselj's Text-Ngrams 1.6, a very flexible ngram package written in Perl.
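To make the description concrete, here is a minimal sketch of counting character ngrams with a ternary search tree, plus one plausible reading of the base-256 ID encoding. It is not the package's actual code: the names `NgramTst`, `add`, `count`, `countCharNgrams`, and `encodeBase256` are hypothetical, and the big-endian, variable-length byte layout is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>

// Illustrative ternary search tree mapping byte strings to counts.
// All names here are hypothetical, not the package's API.
class NgramTst {
    struct Node {
        unsigned char ch;                 // byte stored at this node
        std::size_t count = 0;            // occurrences of the key ending here
        std::unique_ptr<Node> lo, eq, hi; // smaller byte, next byte, larger byte
        explicit Node(unsigned char c) : ch(c) {}
    };
    std::unique_ptr<Node> root_;

public:
    // Increment the count for key; empty keys are ignored.
    void add(const std::string& key) {
        if (key.empty()) return;
        std::unique_ptr<Node>* slot = &root_;
        std::size_t i = 0;
        for (;;) {
            const unsigned char c = static_cast<unsigned char>(key[i]);
            if (!*slot) *slot = std::make_unique<Node>(c);
            Node* n = slot->get();
            if (c < n->ch) slot = &n->lo;
            else if (c > n->ch) slot = &n->hi;
            else if (i + 1 == key.size()) { ++n->count; return; }
            else { slot = &n->eq; ++i; }
        }
    }

    // Return how many times key was added (0 if never seen).
    std::size_t count(const std::string& key) const {
        if (key.empty()) return 0;
        const Node* n = root_.get();
        std::size_t i = 0;
        while (n) {
            const unsigned char c = static_cast<unsigned char>(key[i]);
            if (c < n->ch) n = n->lo.get();
            else if (c > n->ch) n = n->hi.get();
            else if (i + 1 == key.size()) return n->count;
            else { n = n->eq.get(); ++i; }
        }
        return 0;
    }
};

// Add every character ngram of length n in text to the tree.
void countCharNgrams(const std::string& text, std::size_t n, NgramTst& tst) {
    if (n == 0 || text.size() < n) return;
    for (std::size_t i = 0; i + n <= text.size(); ++i)
        tst.add(text.substr(i, n));
}

// Assumed encoding: a word ID written as big-endian base-256 digits,
// one byte per digit, with no leading zero bytes (ID 0 is one zero byte).
std::string encodeBase256(std::uint32_t id) {
    std::string out;
    do {
        out.insert(out.begin(), static_cast<char>(id & 0xFF));
        id >>= 8;
    } while (id != 0);
    return out;
}
```

For example, after calling `countCharNgrams("banana", 2, tst)` on an empty tree, `tst.count("an")` is 2 and `tst.count("ba")` is 1. Keys that share a prefix share nodes along the `eq` links, and each lookup compares single bytes rather than hashing the whole key. A word ngram could be stored the same way by using encoded word IDs as the key, though a real implementation would need fixed-width IDs or a delimiter to keep concatenated IDs unambiguous.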
☆20May 11, 2015Updated 10 years ago
Alternatives and similar repositories for ngrams
Users that are interested in ngrams are comparing it to the libraries listed below
- A Corpus Data Retrieval Index using Lucene for Look-Ups☆19Feb 24, 2026Updated last week
- Automatic Differentiation for OpenCL.☆20Mar 4, 2015Updated 10 years ago
- A framework, data and configs for generating and building Tesseract OCR lang.traineddata model files, specifically for Japanese☆10Dec 9, 2013Updated 12 years ago
- Speech ANDroid Apps☆20Jan 22, 2014Updated 12 years ago
- Lupa for Torch☆10Sep 16, 2015Updated 10 years ago
- Focused Crawler for VT's CTRNet☆10May 13, 2013Updated 12 years ago
- "Save as DAISY" add-in for Microsoft Word☆10Dec 22, 2025Updated 2 months ago
- Madek main web interface☆21Updated this week
- (Labeled) Latent Dirichlet Allocation on a sentence level with Gibbs Sampling☆10Mar 27, 2014Updated 11 years ago
- Redis tcp map for postfix☆12Jun 28, 2024Updated last year
- Grecka is a python script to convert Greek to Greeklish based on ELOT 743☆12Aug 4, 2018Updated 7 years ago
- Browser based post correction tool for Alto XML files☆14Sep 20, 2013Updated 12 years ago
- Matlab based document image analysis and classification system, that makes heavy use of contextual and language cues to decode image glyp…☆12Nov 7, 2011Updated 14 years ago
- A plug-in architecture for extending Siri virtual assistant☆29Mar 30, 2014Updated 11 years ago
- ☆11Sep 10, 2023Updated 2 years ago
- Layers, datasets and utilities for PyTorch☆10Nov 22, 2023Updated 2 years ago
- Repository for UC Santa Cruz's work on Libresoft's CVSAnalY☆15May 13, 2013Updated 12 years ago
- Over-engineered tool for symlinking dotfiles☆37Nov 13, 2013Updated 12 years ago
- Automated svn2git mirror of include-what-you-use: link goes to upstream☆13May 27, 2015Updated 10 years ago
- simple ansible playbook to take clean ubuntu 18.04 to CUDA 10, PyTorch 1.0, fastai, miniconda heaven☆12Dec 16, 2018Updated 7 years ago
- Adium plugin for Tox IM protocol☆14Sep 6, 2014Updated 11 years ago
- C++ FreeVerb implementation in STK☆15Apr 22, 2012Updated 13 years ago
- An EasyMotion plugin for Qt Creator☆11Feb 1, 2016Updated 10 years ago
- Distributed Proofreading of Automatic Segmentations☆15Sep 30, 2022Updated 3 years ago
- BLAS Level 1 operations for ndarrays☆11Jul 30, 2016Updated 9 years ago
- The secure, transparent, auditable, reliable electronic voting system☆14Oct 6, 2016Updated 9 years ago
- This repository contains the code used in a publication 'Active Learning for Decision-Making from Imbalanced Observational Data', Iiris S…☆11May 14, 2019Updated 6 years ago
- Human-friendly query language for Elasticsearch☆23Jun 8, 2021Updated 4 years ago
- A duplicate data detector engine PoC based on Elasticsearch.☆20Apr 3, 2015Updated 10 years ago
- Script to automatically perform zonal OCR on a PDF and rename the PDF according to the results.☆15Jul 24, 2014Updated 11 years ago
- Get your latitude/longitude via wifi access points☆15Sep 25, 2012Updated 13 years ago
- Term List Matching Plugin for ElasticSearch☆26Jan 20, 2014Updated 12 years ago
- Simple HTTP redirector for tmpnb nodes☆12Sep 20, 2017Updated 8 years ago
- Convert tag files (ctags, gccxml, etc) to databases (sqlite, mysql, etc)☆13Mar 30, 2015Updated 10 years ago
- A Windows DLL call helper☆14Dec 19, 2014Updated 11 years ago
- An application of stacked denoising autoencoders to multi-modal (images and audio) abstract feature discovery☆12Oct 23, 2013Updated 12 years ago
- Generate word-word similarities from Gensim's latent semantic indexing (Python)☆11Jan 10, 2017Updated 9 years ago
- snf-image is a Ganeti OS definition. It allows Ganeti to launch instances from predefined or untrusted custom Images. The whole process o…☆12Feb 27, 2018Updated 8 years ago
- Brand disambiguator for tweets to differentiate e.g. Orange vs orange (brand vs foodstuff), using NLTK and scikit-learn☆58Jul 11, 2013Updated 12 years ago