ZhangYiBo513/Simhash-

Deduplicates large volumes of articles (long texts) using Google's simhash algorithm for large-scale web page deduplication.

☆11

Alternatives and similar repositories for Simhash-

Users who are interested in Simhash- are comparing it to the repositories listed below.

hiyoung123 / DuplicateRemove
Text deduplication algorithm based on simhash
☆20 · Jun 18, 2021 · Updated 5 years ago
15810856129 / Simhash
Deduplicates large volumes of text using Simhash
☆12 · Jun 2, 2018 · Updated 8 years ago
FYJNEVERFOLLOWS / MyDataStudio
A data query GUI application built with PyQt5
☆10 · May 20, 2023 · Updated 3 years ago
GaagAlex1 / PhotoDoc
Tool to find a document in a photo and save it as a PDF.
☆10 · Jul 16, 2023 · Updated 3 years ago
mrassili / extension-demo
Chrome Extension demo for a tutorial
☆12 · Mar 5, 2023 · Updated 3 years ago
cybertec-postgresql / pg_timetable_gui
GUI for pg_timetable
☆16 · Apr 13, 2026 · Updated 3 months ago
devonfw-forge / keywi
Master data management system
☆12 · Jan 7, 2023 · Updated 3 years ago
duongphuhiep / vue-sqleditor-poc
A Vue component for an SQL editor based on CodeMirror, with custom auto-completion
☆11 · Jul 29, 2018 · Updated 7 years ago
zhangsaizhaoc / DataAssetManagement
Data asset management
☆10 · Dec 24, 2018 · Updated 7 years ago
XiPotatonium / LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
☆10 · Apr 18, 2023 · Updated 3 years ago
bpatra / ExcelDNAWixInstallerLM
A sample "per machine" installer for ExcelDNA add-ins
☆12 · Nov 8, 2017 · Updated 8 years ago
ZillaRU / VideoSearch-tpu
Video search based on the SG2300X (search video content in natural language and locate the specific time span that matches the description)
☆13 · Feb 29, 2024 · Updated 2 years ago
willshi2023 / springboot-es
Blog search system built with Springboot + ElasticSearch
☆12 · Mar 5, 2020 · Updated 6 years ago
quandl / quandl-excel-windows
Quandl Excel Addin for Windows
☆14 · Dec 1, 2021 · Updated 4 years ago
BlackHole1 / wxwork_message_sdk
SDK for receiving and replying to WeChat Work (企业微信) messages
☆16 · Oct 30, 2020 · Updated 5 years ago
metadata1984 / pyAhocorasick
A pure Python Aho-Corasick algorithm implementation
☆24 · Mar 17, 2014 · Updated 12 years ago
gohouse / crontab
A simple and powerful crontab written in golang with web page management. The built-in web interface makes it easy to manage multiple tasks, and schedules can be set by second, minute, hour, day, month, and week.
☆12 · Mar 14, 2021 · Updated 5 years ago
zhenghaishu / MachineLearning
☆10 · Apr 8, 2018 · Updated 8 years ago
satori1995 / FastAES
A fast AES encryption/decryption library for data security
☆13 · Aug 10, 2025 · Updated 11 months ago
tauris-io / expression
Boolean evaluation and digital calculation expression engine for Java
☆12 · Apr 18, 2022 · Updated 4 years ago
xiaoshuwen1995 / Text-Similarity-Match
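Takes a newly entered text, compares its similarity against existing data, and returns the top 10 texts. Main methods: jieba Chinese word segmentation, gensim, TF-IDF term weighting, and cosine similarity.
☆11 · Jul 30, 2020 · Updated 5 years ago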
talent518 / tensorflow
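A collection of completed tensorflow examples that train in image classification mode and use the result for image recognition. Supports console mode and B/S (browser/server) mode.
☆12 · Jul 31, 2017 · Updated 8 years ago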
CrossmodalGroup / ESL
☆12 · May 3, 2024 · Updated 2 years ago
RedisLabs / redis-completion
Autocomplete with Redis
☆15 · Dec 5, 2013 · Updated 12 years ago
Daphnis-z / nlp-ztools
This project implements several common NLP algorithms: keyword extraction, named entity recognition, automatic summarization, text similarity comparison, and more.
☆16 · Jan 16, 2022 · Updated 4 years ago
huangantai / QywxPython
Operate WeChat Work (企业微信) with python3: send text, images, voice, video, and files. Supports command-line invocation and import by other classes.
☆13 · Apr 4, 2019 · Updated 7 years ago
TheDataLeek / Python-LSA
Performing Latent Semantic Analysis with Python on large datasets.
☆13 · Jun 21, 2022 · Updated 4 years ago
ChrisLee0211 / FI_Search
A search engine website built on elasticsearch
☆14 · Nov 22, 2022 · Updated 3 years ago
Sl0v3C / PriceSpider
Price Spider is a Python tool to get prices and promotions from JD, Tmall, Amazon, and BeiBei
☆10 · Jun 14, 2019 · Updated 7 years ago
Excel-DNA / AddInManager
The Excel-DNA Add-In Manager makes it easy to distribute, install and update Excel add-ins.
☆15 · Mar 19, 2024 · Updated 2 years ago
mirraico / Pedometer
A pedometer built to support the team's Bluetooth indoor positioning project
☆11 · Jan 10, 2017 · Updated 9 years ago
yelc66 / 98MagnetDownload
Client for one-click pushing of magnet links from 98堂 (色花堂) to a downloader
☆32 · Jul 18, 2024 · Updated 2 years ago
poemp / metadata-gather
Metadata collection: fetches information on all tables in a specified target database
☆12 · Sep 8, 2022 · Updated 3 years ago
IRoye / sqlEditor
A front-end interface for an SQL editor
☆18 · Nov 28, 2020 · Updated 5 years ago
NICE-FUTURE / tfidf-cosine-text-recommendation
[Demo] Recommends similar news headlines using TF-IDF vectorization and cosine similarity
☆14 · Mar 2, 2020 · Updated 6 years ago
shiyunbo / django-static-page-generator-celery-redis
A demo of asynchronous generation of static HTML pages using Django 3.0 + Celery 4.4 + Redis 3.3.
☆15 · Jan 6, 2022 · Updated 4 years ago
dartsyms / scanverter
Capture photos, convert to PDF, recognize text (OCR) with Tesseract, share, etc. (SwiftUI, Combine, Tesseract)
☆14 · Mar 14, 2021 · Updated 5 years ago
snower / sevent
The highest performance event loop.
☆15 · Feb 12, 2026 · Updated 5 months ago
graphway / neo4j-algo
Neo4j graph database and graph algorithms
☆16 · Oct 13, 2020 · Updated 5 years ago