Large-scale duplicate-content detection implemented with the simhash algorithm
☆14 · Apr 23, 2016 · Updated 10 years ago
Alternatives and similar repositories for check_file_system
Users that are interested in check_file_system are comparing it to the libraries listed below.
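- Fast indexing and querying of large numbers of text résumés using the simhash algorithm ☆21 · Dec 16, 2015 · Updated 10 years ago
- Deduplicating massive amounts of text with Simhash ☆12 · Jun 2, 2018 · Updated 7 years ago
- Fast duplicate detection for massive Chinese text ☆18 · Dec 16, 2018 · Updated 7 years ago
- A simple TCP/IP protocol stack supporting TCP, UDP, dynamic IP acquisition via DHCP, keep_alive, and more ☆24 · Mar 30, 2018 · Updated 8 years ago
- ☆10 · Apr 8, 2018 · Updated 8 years ago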
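- Social information retrieval coursework: implements a simple search engine and computes TF-IDF values and the similarity between two sentences ☆19 · Apr 4, 2018 · Updated 8 years ago
- A collection of completed TensorFlow examples that train an image classification model and use it for image recognition; supports console mode and B/S (browser/server) mode. ☆12 · Jul 31, 2017 · Updated 8 years ago
- Chinese sentence similarity computation based on the gensim module ☆52 · Aug 1, 2018 · Updated 7 years ago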
- autocomplete with redis ☆15 · Dec 5, 2013 · Updated 12 years ago
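- Deduplicates massive numbers of articles (long texts) based on the simhash algorithm Google uses for large-scale web page deduplication. ☆11 · Dec 8, 2022 · Updated 3 years ago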
- Price Spider is a Python tool to get price & promotion from JD, Tmall, Amazon, BeiBei ☆10 · Jun 14, 2019 · Updated 6 years ago
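- A distributed crawler project carried out by the Gao Ying lab at South China University of Technology; it must not be distributed by anyone other than lab members. ☆21 · Jul 13, 2014 · Updated 11 years ago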
- A demo of asynchronous generation of static html pages using Django 3.0 + Celery 4.4 + Redis 3.3. ☆15 · Jan 6, 2022 · Updated 4 years ago
- A multithreaded crawler that crawls the entire web ☆18 · Dec 2, 2016 · Updated 9 years ago
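- Douyin 9.1.1 (other versions untested): makes the encrypted part of device_register interface traffic captured with Fiddler display as plaintext; hooks XG ☆17 · Jul 3, 2020 · Updated 5 years ago
- Recommender system with a Django-based web front end ☆12 · Nov 1, 2017 · Updated 8 years ago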
- Python API for Various DB-Backed Simhash Clusters ☆64 · Mar 16, 2017 · Updated 9 years ago
- Ouroboros: On Accelerating Training of Transformer-Based Language Models ☆10 · Nov 7, 2019 · Updated 6 years ago
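- ZEGO GoClass is a scenario solution built on ZEGO's audio/video interaction service, the ZEGO interactive whiteboard service (ZegoWhiteboard), and ZEGO's cloud recording service. It was developed around the common scenarios and needs of the online education industry, and educational institutions can use it directly to run their operations. ☆10 · Aug 4, 2022 · Updated 3 years ago
- Requests + futures = <3 - a grequests fork, original code: https://github.com/kennethreitz/grequests ☆63 · Jun 18, 2015 · Updated 10 years ago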
- Issuer Identification Number Database and Verification Utility Library. Luhn Algorithm, BIN Checker, Random Credit Card Generators. ☆24 · Jan 13, 2024 · Updated 2 years ago
- token bucket ratelimiter for nginx-lua/go/gin-middleware ☆28 · Jul 5, 2023 · Updated 2 years ago
- Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION" ☆15 · Nov 11, 2024 · Updated last year
- A method for event correlation detection based on Spatial-Temporal-Textual point process ☆13 · Dec 16, 2019 · Updated 6 years ago
- python3 pytorch>=0.4 ☆11 · Dec 25, 2019 · Updated 6 years ago
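- online-exam-backend is the backend module of an online examination system. It is a RESTful service built on Jersey + Spring, with features including user management, online exams, automatic grading, score management, wrong-answer management, a message board, exam paper management, question bank management, and question subject maintenance. ☆11 · Mar 19, 2021 · Updated 5 years ago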
- A way to turn markdown into HTML and ebooks ☆102 · Sep 26, 2013 · Updated 12 years ago
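- Computes the similarity of Chinese texts using TF feature vectors and simhash fingerprints ☆217 · Aug 12, 2016 · Updated 9 years ago
- Design and implementation of a multi-source, multi-category text-and-image data monitoring platform (Python, Django, web crawlers, Echarts, and other technologies) ☆19 · Jul 4, 2018 · Updated 7 years ago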
- ☆15 · Feb 5, 2022 · Updated 4 years ago
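- Desktop client for a driving-school online exam simulator. Subject 1 and Subject 4 support voice readout, explanations for wrong answers, and other features. Tech stack: develop once and adapt to multiple platforms; web version; can generate a desktop installer; mainly uses electron-builder + the Vue ecosystem and element-ui ☆14 · Aug 5, 2020 · Updated 5 years ago
- Document deduplication that addresses semantically duplicate documents in search engines, using a semantic fingerprint algorithm based on multiple hashing. ☆11 · Aug 17, 2013 · Updated 12 years ago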
- A ctypes-based python module that provides access to Bob Jenkins' hash function. ☆17 · Dec 3, 2009 · Updated 16 years ago
- Companion demo for paascloud ☆13 · May 19, 2018 · Updated 8 years ago
- SDK: mobile RTMP live-stream pushing, similar to the live-stream pushing of Huajiao and Inke ☆16 · Feb 22, 2016 · Updated 10 years ago
- Attempts to prune yolo v3 tiny. ☆10 · Dec 13, 2018 · Updated 7 years ago
- ☆28 · Oct 28, 2018 · Updated 7 years ago
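- This case study learns from the Alibaba Cloud server management console (hereafter "the console"). The console application is a combination of multiple Angular projects and opens different projects by redirecting to different second-level domains. For example, the current project is ecs.aliyun.com and its default route is #home. This is of course for learning purposes, not to provoke Alibaba, because I saw that its front-end skin… ☆15 · Feb 9, 2017 · Updated 9 years ago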
- Aspose.Words for Java Examples ☆18 · Apr 22, 2024 · Updated 2 years ago