Locality-sensitive hashing algorithm for text similarity comparisons
☆59Apr 9, 2025Updated 10 months ago
Alternatives and similar repositories for py-nilsimsa
Users that are interested in py-nilsimsa are comparing it to the libraries listed below
- A project for clustering text streams using locality-sensitive hashing (LSH) in Python☆26Sep 23, 2011Updated 14 years ago
- common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text☆35Sep 30, 2016Updated 9 years ago
- An autoscaling Python script for Heroku☆27May 16, 2012Updated 13 years ago
- run multiple shell commands in parallel and coordinate their output☆31Jul 5, 2012Updated 13 years ago
- Distributed Web Crawler, Parser and Search Engine.☆10Jun 16, 2016Updated 9 years ago
- stav text annotation visualiser☆34Nov 2, 2011Updated 14 years ago
- Small tools to assist with using Large Language Models☆12Nov 7, 2023Updated 2 years ago
- ☆15Dec 26, 2021Updated 4 years ago
- Initial SKSE Plugin to expose OpenVR to Papyrus, feel free to make suggestions and open issues. BEWARE this is not a stable release, once…☆11Dec 14, 2018Updated 7 years ago
- A Text Comprehension Engine in Python☆15Aug 23, 2015Updated 10 years ago
- Official repository of "Efficient and Effective Query Expansion for Web Search", Short Paper @ CIKM 2018☆15Nov 17, 2019Updated 6 years ago
- Genyris presents a paradigm in which objects can belong to multiple classes independent from construction allowing data to be classified …☆17Nov 16, 2025Updated 3 months ago
- ReconNER, Debug annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality of your data.☆35Jul 26, 2020Updated 5 years ago
- Tweets annotated with coarse-grained sense labels (supersenses)☆13Jun 13, 2014Updated 11 years ago
- Knowledge-based Semantic Role Labeling☆16Jan 31, 2025Updated last year
- Python library for parsing eLife article XML data.☆15Jan 28, 2026Updated last month
- Topic Model or LDA in Cython☆21Apr 9, 2011Updated 14 years ago
- Data and code for the experiments in the Outlier Detection task proposed by Camacho-Collados et al.☆13Aug 28, 2018Updated 7 years ago
- ☆18Jun 12, 2023Updated 2 years ago
- A friendlier interface to `socket`.☆14Apr 11, 2015Updated 10 years ago
- TextFlows is an open-source online platform for composition, execution, and sharing of interactive text mining and natural language proce…☆19Dec 1, 2017Updated 8 years ago
- Building Event Extraction and Trending Framework for Twitter☆14Sep 13, 2017Updated 8 years ago
- A virtual PDF analysis framework☆17Jan 31, 2014Updated 12 years ago
- Statistical Natural Language Processing with Annotated Suffix Trees☆22Jul 22, 2016Updated 9 years ago
- An open source virus scan aggregation framework.☆25Apr 25, 2014Updated 11 years ago
- Code for Fast Information-theoretic Bayesian Optimisation☆16Jun 7, 2018Updated 7 years ago
- A streaming cross-cat inference engine☆20Mar 27, 2024Updated last year
- A Python module that provides content extraction and summarization of a web page, even if the page is broken.☆18Apr 14, 2023Updated 2 years ago
- A simple and fast search engine☆70Jun 21, 2022Updated 3 years ago
- Apache Nutch extensions☆34Mar 21, 2022Updated 3 years ago
- Slinky, a high-performance web crawler / text analytics in Python, Redis, Hadoop, R, Gephi☆41Aug 30, 2010Updated 15 years ago
- ☆18Apr 25, 2018Updated 7 years ago
- Transform unstructured document collections to structured Linked Data☆29Sep 12, 2025Updated 5 months ago
- IoC's, PCRE's, YARA's etc☆23Mar 25, 2025Updated 11 months ago
- A DSL to build Lucene text queries in Python.☆38Jan 5, 2017Updated 9 years ago
- python3 package supporting efficient storage and querying of sets of sets using the trie data structure. Supports finding all the superse…☆23Sep 15, 2023Updated 2 years ago
- Code for KDD 2014 paper "Mining Topics in Documents: Standing on the Shoulders of Big Data"☆21Oct 6, 2015Updated 10 years ago
- Website for standardized execution and evaluation of algorithms on datasets.☆36Nov 14, 2019Updated 6 years ago
- Starting point for a new Node.js RESTful API☆19Feb 19, 2019Updated 7 years ago