parkr/near-dup-detection

Near-Duplicate Detection in Python.

☆25

Alternatives and similar repositories for near-dup-detection

Users that are interested in near-dup-detection are comparing it to the libraries listed below.

haiquangtran / RecommenderSVD
Recommender system that implements Simon Funk's iterative and approximation of Singular Value Decomposition made popular from the Netflix…
☆10 · Nov 18, 2015 · Updated 10 years ago

datamicroscopes / lda
Latent dirichlet allocation (LDA) for datamicroscopes
☆41 · Oct 16, 2015 · Updated 10 years ago

felipelouza / egsa
Generalized enhanced suffix array construction in external memory [CPM'13, AMB 2017]
☆17 · Aug 9, 2021 · Updated 4 years ago

kzhai / PyAdaGram
An Adaptor Grammar model implementation in Python.
☆17 · Jan 31, 2020 · Updated 6 years ago

shufo / docker-phoenix
A script for creating elixir image for phoenix framework.
☆10 · Dec 15, 2023 · Updated 2 years ago

bgithub1 / cme_open_interest
ETL project to download and process both CME open interest data, COT data from the CFTC and NAV/shares-outstanding data from various ETF …
☆13 · Jul 13, 2021 · Updated 5 years ago

microsoft / QBASHER
Inverted file indexing and retrieval optimized for short texts. Supports auto-suggest and query segment classification.
☆34 · Jun 12, 2023 · Updated 3 years ago

TamedTornado / vn-nlp-libraries
A collection of NLP libraries based on the work of Lê Hồng Phương, original source http://mim.hus.vnu.edu.vn/phuonglh/.
☆28 · Sep 13, 2024 · Updated last year

laurenfklein / QTM340-Fall22
Notebooks and other course materials for Emory QTM 340 (Fall 2022)
☆12 · Dec 13, 2022 · Updated 3 years ago

internaut / tmtoolkit
Text Mining and Topic Modeling Toolkit for Python with parallel processing power
☆16 · May 4, 2023 · Updated 3 years ago

S-P-Quantamental / Natural-Language-Processing-Part-I-Primer
☆16 · Feb 25, 2020 · Updated 6 years ago

aguschin / kaggle
some useful code for kaggle competitions :)
☆17 · Mar 19, 2015 · Updated 11 years ago

pavelchristof / template-scala-cml-sentiment
Sentiment analysis with PredictionIO and CML
☆12 · Jul 8, 2015 · Updated 11 years ago

SymbolixAU / rapidjsonr
R package exposing the rapidjsonr c++ header-only library
☆16 · Nov 23, 2025 · Updated 7 months ago

civisanalytics / GephiForceDiagramTool
Command-line tool for building Gephi force-directed graph diagrams.
☆10 · Nov 10, 2017 · Updated 8 years ago

UPB-SS1 / PyCrowdTangle
A Python Wrapper To Retrieve Data From The CrowdTangle API
☆11 · Mar 26, 2026 · Updated 3 months ago

VIDA-NYU / memex
☆13 · Nov 30, 2015 · Updated 10 years ago

tilt-dev / vscode-go-autotest
☆11 · Dec 7, 2022 · Updated 3 years ago

RobeDM / ICDM2015
☆23 · Nov 15, 2015 · Updated 10 years ago

akokaz1 / Algo-Trading
Repository for algorithmic trading ideas
☆10 · Aug 12, 2021 · Updated 4 years ago

ContextLab / memory-models-course
Dartmouth graduate course (PSYC 133) on computational memory models
☆14 · May 21, 2025 · Updated last year

ckorzen / icecite
The repository of Icecite, a research paper management system.
☆15 · Mar 29, 2018 · Updated 8 years ago

Strateus / dionis
Dionis predictors blender
☆10 · Oct 21, 2015 · Updated 10 years ago

OpenAff / geoip-js
GeoIP Target Redirection and Target Filter Redirector using GeoIP API (JS)
☆22 · Oct 14, 2020 · Updated 5 years ago

felipelouza / gsa-is
Inducing enhanced suffix arrays for string collections [DCC'16, TCS 2017]
☆27 · Dec 2, 2025 · Updated 7 months ago

socialcomquant / css_methods_python_workshop
Materials for the 2023 SOCIAL COMQUANT "Introduction to CSS Methods with Python"
☆15 · May 2, 2025 · Updated last year

researchapps / job-maker
a static web application for generating job submissions scripts for a SLURM cluster
☆20 · Feb 23, 2023 · Updated 3 years ago

5harad / cost-of-fairness
Replication materials for "Algorithmic decision making and the cost of fairness," by Corbett-Davies et al.
☆12 · Jun 1, 2017 · Updated 9 years ago

schliebs / rtangle
R Interface for CrowdTangle Facebook API
☆10 · Oct 27, 2021 · Updated 4 years ago

CasAndreu / ldaRobust
This is a package to implement the Robust Latent Dirichlet Approach in R.
☆10 · Apr 25, 2019 · Updated 7 years ago

royceschultz / Generative-Art
A growing collection of generative art projects
☆14 · Oct 8, 2019 · Updated 6 years ago

calavera / crawler
A distributed image crawler
☆19 · Feb 9, 2015 · Updated 11 years ago

teodor-pripoae / protokol
Protocol buffers for Crystal
☆18 · May 19, 2019 · Updated 7 years ago

singhj / locality-sensitive-hashing
☆27 · Dec 1, 2015 · Updated 10 years ago

ContextLab / quail
A python toolbox for analyzing and plotting free recall data
☆21 · Dec 11, 2025 · Updated 7 months ago

yoichi1484 / subspace
An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024)
☆10 · May 31, 2024 · Updated 2 years ago

hammerlab / seltest
The simple, fast, visual testing framework for web applications.
☆13 · Nov 3, 2015 · Updated 10 years ago

davben / arvig
An R data package containing georeferenced events of right-wing violence in Germany from 2014 onwards
☆11 · Jun 27, 2018 · Updated 8 years ago

DerGuteWolf / lrm-google
Support for Google Directions API in Leaflet Routing Machine
☆11 · Jan 9, 2019 · Updated 7 years ago