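The project consists of three modules: a scrapy-redis distributed crawler that collects the data, ElasticSearch-based data retrieval, and a front-end interface for display. It was built to get familiar with the basic scrapy-redis workflow and the principles behind it, and with using ElasticSearch. It can serve as a simple but fairly complete full-stack demo built on ES storage. All components are set up in a local Windows 10 environment (pseudo-distributed) and are meant to demonstrate the project workflow. You can use this project as a reference and extend it to multiple hosts to run distributed ES and distributed Scrapy.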
☆25 · Dec 8, 2022 · Updated 3 years ago
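The crawling module described above relies on the idea behind scrapy-redis: every crawler process pulls from one shared request queue and consults one shared duplicate filter, both kept in Redis. The following is a minimal sketch of that idea only, not the project's code. `SharedFrontier`, `crawl`, `SITE`, and the worker names are hypothetical, an in-memory deque and set stand in for Redis, and a plain dict stands in for the ElasticSearch index.

```python
import hashlib
from collections import deque


class SharedFrontier:
    """In-memory stand-in (hypothetical) for a Redis-backed queue and dedup set."""

    def __init__(self):
        self.queue = deque()  # pending URLs, shared by all workers
        self.seen = set()     # fingerprints of URLs already scheduled

    @staticmethod
    def fingerprint(url):
        # Hash the URL so the dedup set stores fixed-size keys.
        return hashlib.sha1(url.encode("utf-8")).hexdigest()

    def push(self, url):
        """Schedule a URL once; return False if it was already seen."""
        fp = self.fingerprint(url)
        if fp in self.seen:
            return False
        self.seen.add(fp)
        self.queue.append(url)
        return True

    def pop(self):
        """Return the next pending URL, or None when the queue is empty."""
        return self.queue.popleft() if self.queue else None


def crawl(frontier, workers, fetch_links, index):
    """Let the workers take turns on the shared frontier until it is empty."""
    turn = 0
    while True:
        url = frontier.pop()
        if url is None:
            break
        worker = workers[turn % len(workers)]
        turn += 1
        index[url] = worker  # stand-in for indexing a document into ES
        for link in fetch_links(url):
            frontier.push(link)  # duplicates are dropped by the frontier
    return index


# Hypothetical link graph standing in for real pages.
SITE = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}

frontier = SharedFrontier()
frontier.push("a")
result = crawl(frontier, ["worker-1", "worker-2"], lambda u: SITE.get(u, []), {})
```

Because the queue and the seen-set live outside the workers, adding more workers (or, with real Redis, more hosts) does not cause a page to be fetched twice. That property is what lets a pseudo-distributed setup be extended to several machines.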
Alternatives and similar repositories for JobNews-ElasticSearch-Scrapy_redis
Users that are interested in JobNews-ElasticSearch-Scrapy_redis are comparing it to the libraries listed below
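- A distributed movie search engine built on Scrapy + Elasticsearch + Django ☆31 · Jul 25, 2018 · Updated 7 years ago
- Building an Elasticsearch search engine with a distributed crawler based on the Scrapy-Redis framework and MongoDB ☆18 · Apr 21, 2020 · Updated 5 years ago
- Building a search engine with Python ☆30 · May 5, 2022 · Updated 3 years ago
- Crawling with Scrapy, storage in MySQL, display with Django ☆12 · Feb 6, 2016 · Updated 10 years ago
- Deduplicates massive numbers of articles (long texts) using simhash, Google's algorithm for large-scale web page deduplication. ☆11 · Dec 8, 2022 · Updated 3 years ago
- Video retrieval based on the SG2300X [search video content in natural language and locate the specific time span that matches the description] ☆13 · Feb 29, 2024 · Updated 2 years ago
- A blog search system built with Springboot + ElasticSearch ☆12 · Mar 5, 2020 · Updated 6 years ago
- Scrapy, tianya, 天涯 (Tianya); incremental crawl of all posts from the Tianya 莲蓬鬼话 board using scrapy and django ☆21 · Mar 20, 2025 · Updated last year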
- Enables sharing the same TCP port between different applications, for example http and ssh. ☆11 · Oct 2, 2020 · Updated 5 years ago
- ☆12 · May 3, 2024 · Updated last year
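- Restaurant management system: practice with JDBC, the MySQL database, and the Druid connection pool; user login, table reservation, ordering, checkout, and staff management ☆12 · Feb 22, 2022 · Updated 4 years ago
- A simhash-based text deduplication algorithm ☆20 · Jun 18, 2021 · Updated 4 years ago
- A static site built with the vue-element-admin framework ☆12 · Dec 4, 2018 · Updated 7 years ago
- A simple regular expression engine! ☆10 · Apr 9, 2017 · Updated 8 years ago
- Website monitoring ☆11 · Nov 9, 2019 · Updated 6 years ago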
- some example plots that are maybe useful? ☆11 · Feb 10, 2026 · Updated last month
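- Batch download of Douyin users' videos ☆20 · Jan 19, 2024 · Updated 2 years ago
- Mainly uses Python + the Scrapy framework to crawl news websites ☆25 · Mar 2, 2017 · Updated 9 years ago
- ☆10 · Dec 23, 2020 · Updated 5 years ago
- Long-text similarity model ☆21 · Nov 24, 2023 · Updated 2 years ago
- Graduation project: "Design and Implementation of Video-Text Retrieval Based on the CLIP Model" ☆18 · Jul 21, 2024 · Updated last year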
- MLflow App Using React, Hooks, RabbitMQ, FastAPI Server, Celery, Microservices ☆11 · Sep 25, 2022 · Updated 3 years ago
- A Cross-Platform Lightweight 2D Tank Multiplayer Game in Python 2/3 ☆10 · Oct 24, 2020 · Updated 5 years ago
- Leveraging IBM DB2’s Federation Capabilities to Perform SQL Analytics on a Sample Blockchain Insurance Application using Hyperledger Fabr… ☆12 · Sep 17, 2025 · Updated 6 months ago
- CPU simulation framework for CS520 (Binghamton University, Graduate Computer Architecture) ☆10 · May 10, 2018 · Updated 7 years ago
- Pipeline extensions for feapder ☆16 · Mar 6, 2023 · Updated 3 years ago
- ⌨️ RISC-V NS16550A UART driver ☆11 · Mar 24, 2021 · Updated 4 years ago
- An information-security article search engine based on a distributed crawler ☆27 · May 22, 2023 · Updated 2 years ago
- UNIX like OS ☆17 · Mar 7, 2020 · Updated 6 years ago
- SSM_CRUD (ssmcrud), a simple employee management system based on ssm + bootstrap. An SSM project ☆31 · Sep 12, 2025 · Updated 6 months ago
- risc-v OS inspired by xv6 ☆15 · Aug 24, 2023 · Updated 2 years ago
- Silent is very lightweight, high quality - low latency voice chat for gaming. ☆13 · Dec 8, 2021 · Updated 4 years ago
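- My navigation study notes, covering source-code walkthroughs of open-source navigation and positioning programs (including RTKLIB, GAMP, SoftGNSS, KF-GINS, ORB-SLAM3, etc.), how to use various navigation devices, books and lecture notes, blog translations, a survey of open-source projects, a record of frequently used websites, common Linux/Vim/Git/ROS/VSCode… ☆16 · Mar 20, 2024 · Updated last year
- A book about choosing university applications after the gaokao (college entrance exam) ☆13 · Jun 12, 2019 · Updated 6 years ago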
- demo natural language video db using CLIP ☆28 · Aug 7, 2024 · Updated last year
- A scrapy pipeline which send items to Elastic Search server ☆98 · Jan 2, 2018 · Updated 8 years ago
- ☆15 · Aug 1, 2021 · Updated 4 years ago
- Drag Captcha ☆20 · May 28, 2021 · Updated 4 years ago
- Jandan (煎蛋) crawler that scrapes the 无聊图 board; telegram bot ☆16 · Dec 5, 2025 · Updated 3 months ago