FurkanToprak / OkapiBM25
Well-tested implementation of the OkapiBM25 algorithm. Install the npm package!
☆14 Updated 2 months ago
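For readers unfamiliar with the algorithm, here is a minimal sketch of Okapi BM25 scoring in plain JavaScript. It is not the API of the OkapiBM25 package. The names `tokenize` and `bm25Scores`, the defaults `k1 = 1.2` and `b = 0.75`, and the non-negative IDF variant `ln(1 + (N - n + 0.5) / (n + 0.5))` are assumptions made for illustration.

```javascript
// Illustrative Okapi BM25 sketch; names and defaults are assumptions,
// not the interface of the OkapiBM25 npm package.

// Hypothetical helper: lowercase the text and split on non-alphanumeric runs.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Returns one BM25 score per document for the given query.
// k1 controls term-frequency saturation; b controls length normalization.
function bm25Scores(documents, query, k1 = 1.2, b = 0.75) {
  const docs = documents.map(tokenize);
  const n = docs.length;
  const avgLen = docs.reduce((sum, d) => sum + d.length, 0) / n;
  const terms = [...new Set(tokenize(query))];

  // Document frequency: how many documents contain each query term.
  const df = new Map(
    terms.map((t) => [t, docs.filter((d) => d.includes(t)).length])
  );

  return docs.map((doc) => {
    let score = 0;
    for (const term of terms) {
      const tf = doc.filter((w) => w === term).length;
      if (tf === 0) continue; // term absent: contributes nothing
      const nt = df.get(term);
      // Assumed IDF variant with +1 inside the log, so it never goes negative.
      const idf = Math.log(1 + (n - nt + 0.5) / (nt + 0.5));
      const norm = tf + k1 * (1 - b + (b * doc.length) / avgLen);
      score += (idf * tf * (k1 + 1)) / norm;
    }
    return score;
  });
}
```

Calling `bm25Scores(["the quick brown fox", "lazy dog sleeps", "fox and dog"], "fox")` returns one score per document. Documents that lack the term score 0, and with equal term frequency a shorter document scores higher than a longer one.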
Related projects
Alternatives and complementary repositories for OkapiBM25
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi… ☆48 Updated 2 years ago
- email dataset for email signature parsing ☆54 Updated 8 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆25 Updated last month
- Emoji embeddings trained using their emotional content from their online dictionary meanings. ☆13 Updated 2 years ago
- An index data structure for approximate string search. ☆23 Updated 5 years ago
- In browser active learning and guided search ☆17 Updated last year
- Traptor -- A distributed Twitter feed ☆26 Updated 2 years ago
- Web page segmentation and noise removal ☆55 Updated 9 months ago
- ☆29 Updated 2 years ago
- Using embeddings compressed by Product Quantization, in Javascript ☆30 Updated last year
- An npm package that allows easy entity searching of Wikidata. ☆10 Updated 7 years ago
- spaCy on the web ☆43 Updated last year
- Metadata Extractor & Loader (MEL) ■ The NLP-NER Toolkit (TNNT) ☆22 Updated last year
- ☆70 Updated last year
- KnowledgeStore ☆20 Updated 6 years ago
- Distance/Similarity functions for Bag of Words, Strings, Vectors and more. ☆23 Updated last year
- A graph query engine ☆10 Updated 6 months ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆37 Updated 8 months ago
- Multilingual tokenizer that automatically tags each token with its type ☆61 Updated last year
- 🌐 Netbase : Semantic Graph Database & Wikidata Server ☆8 Updated last year
- The missing datasets manager. Like Homebrew but for datasets. CLI tool to search and discover datasets! ☆41 Updated 7 years ago
- Machine Learning and Natural Language Processing of the EEA Corpus via spaCy, Textacy and pyLDAvis and other useful NLP algorithms. ☆14 Updated last year
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆57 Updated 9 months ago
- ☆21 Updated 8 years ago
- Dockerfile for audiogrep and pocketsphinx ☆11 Updated 8 years ago
- A lightweight, standardized library accessing files and datasets, especially tabular ones (CSV, Excel). ☆71 Updated last year
- Finds linguistic patterns effortlessly ☆33 Updated last year
- A Cython implementation of the affine gap string distance ☆58 Updated last year