Deduplicates massive numbers of articles (long texts) using simhash, Google's algorithm for large-scale web page deduplication.
☆11 Dec 8, 2022 Updated 3 years ago
Alternatives and similar repositories for Simhash-
Users who are interested in Simhash- are comparing it to the libraries listed below.
- A text deduplication algorithm based on simhash ☆20 Jun 18, 2021 Updated 4 years ago
- Deduplicating massive amounts of text using Simhash ☆12 Jun 2, 2018 Updated 7 years ago
- A data query GUI software using PyQt5 ☆10 May 20, 2023 Updated 2 years ago
- Tool to find a document in a photo and save it as a PDF. ☆10 Jul 16, 2023 Updated 2 years ago
- GUI for pg_timetable ☆15 Feb 27, 2026 Updated last month
- master-data-management system ☆12 Jan 7, 2023 Updated 3 years ago
- A vue component of an SQL Editor based on CodeMirror, with a custom auto-completion ☆11 Jul 29, 2018 Updated 7 years ago
- Data asset management ☆10 Dec 24, 2018 Updated 7 years ago
- LAVIS - A One-stop Library for Language-Vision Intelligence ☆10 Apr 18, 2023 Updated 2 years ago
- ☆10 Apr 8, 2018 Updated 7 years ago
- A sample installer "per machine" for excelDNA addins ☆12 Nov 8, 2017 Updated 8 years ago
- SDK for receiving and replying to WeCom (企业微信) messages ☆16 Oct 30, 2020 Updated 5 years ago
- Video retrieval based on SG2300X [search video content using natural language and locate the specific time span that matches the description] ☆13 Feb 29, 2024 Updated 2 years ago
- Building a blog search system with Springboot + ElasticSearch ☆12 Mar 5, 2020 Updated 6 years ago
- Quandl Excel Addin for Windows ☆13 Dec 1, 2021 Updated 4 years ago
- A simple and powerful crontab written in golang with web page management. It ships with a web interface for conveniently managing multiple tasks and supports scheduling by second, minute, hour, day, month, and week. ☆12 Mar 14, 2021 Updated 5 years ago
- Chrome Extension demo for a tutorial ☆12 Mar 5, 2023 Updated 3 years ago
- A pedometer implemented to support the team's Bluetooth indoor positioning project ☆11 Jan 10, 2017 Updated 9 years ago
- a pure python Aho-corasick algorithm implementation ☆24 Mar 17, 2014 Updated 12 years ago
- A collection of completed tensorflow examples that train in image-classification mode and perform image recognition; supports console mode and B/S (browser/server) mode. ☆12 Jul 31, 2017 Updated 8 years ago
- A fast AES encryption/decryption library for data security ☆13 Aug 10, 2025 Updated 7 months ago
- Capture photos, convert to pdf, (ocr) text recognition with tesseract, share etc (SwiftUI, Combine, Tesseract) ☆14 Mar 14, 2021 Updated 5 years ago
- autocomplete with redis ☆15 Dec 5, 2013 Updated 12 years ago
- Functionality: given a newly entered text, compares its similarity against existing data and returns the top 10 texts. Main methods: jieba Chinese word segmentation, gensim, TF-IDF term weighting, and cosine similarity. ☆11 Jul 30, 2020 Updated 5 years ago
- Boolean evaluation and digital calculation expression engine for Java ☆12 Apr 18, 2022 Updated 3 years ago
- Operate WeCom (企业微信) with python3: send text, images, voice, video, and files; supports command-line invocation and use from other classes. ☆13 Apr 4, 2019 Updated 6 years ago
- A simple, easy-to-use data synchronization and export framework ☆11 Mar 12, 2026 Updated 2 weeks ago
- Price Spider is a Python tool to get price & promotion from JD, Tmall, Amazon, BeiBei ☆10 Jun 14, 2019 Updated 6 years ago
- ☆12 May 3, 2024 Updated last year
- [NeurIPS 2025] Bag of Tricks for Inference-time Computation of LLM Reasoning ☆16 Sep 20, 2025 Updated 6 months ago
- This project contains implementations of several common NLP algorithms: keyword extraction, named entity recognition, automatic summarization (abstract), text similarity comparison, etc. ☆16 Jan 16, 2022 Updated 4 years ago
- Open source OKR application ☆14 Mar 19, 2026 Updated last week
- Performing Latent Semantic Analysis with Python on large datasets. ☆13 Jun 21, 2022 Updated 3 years ago
- Account login tool for Honor of Kings (王者荣耀) ☆12 Jun 28, 2023 Updated 2 years ago
- A demo of asynchronous generation of static html pages using Django 3.0 + Celery 4.4 + Redis 3.3. ☆15 Jan 6, 2022 Updated 4 years ago
- A search engine website built on elasticsearch ☆14 Nov 22, 2022 Updated 3 years ago
- The Excel-DNA Add-In Manager makes it easy to distribute, install and update Excel add-ins. ☆15 Mar 19, 2024 Updated 2 years ago
- 98堂 色花堂: push magnet links to a download client with one click ☆31 Jul 18, 2024 Updated last year
- Metadata collection: fetches information on all tables in a specified target database ☆12 Sep 8, 2022 Updated 3 years ago