Distributed crawler based on the Scrapy-Redis framework and MongoDB, with an Elasticsearch-powered search engine
☆18 · Apr 21, 2020 · Updated 5 years ago
Alternatives and similar repositories for Scrapy_spider
Users that are interested in Scrapy_spider are comparing it to the libraries listed below.
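- 🕷️ [Graduation Project] Scrapy-Redis distributed crawler + Elasticsearch search engine + Django full-stack application; paper search engine (including Scrapy-R… ☆44 · Feb 18, 2023 · Updated 3 years ago
- Crawling zhihu, jobbole, lagou by Scrapy, and using Elasticsearch+Django to build a Search Engine website --- README_zh.md (including: i… ☆40 · Aug 23, 2018 · Updated 7 years ago
- The project consists of three modules: a scrapy-redis distributed crawler for data collection, ElasticSearch-based data retrieval, and a front-end display. It was built to get familiar with the basic workflow of scrapy-redis and the principles behind it, as well as the use of ElasticSearch. This project can serve as an ES-storage-based, simple but… ☆25 · Dec 8, 2022 · Updated 3 years ago
- Deduplicates massive numbers of articles (long texts) based on simhash, Google's algorithm for large-scale web page deduplication. ☆11 · Dec 8, 2022 · Updated 3 years ago
- Microservice authentication center based on spring-security ☆14 · Nov 9, 2022 · Updated 3 years ago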
- elasticsearch7.9 cdh-ext-parcels and single machine multi instance ☆10 · Jul 12, 2021 · Updated 4 years ago
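- Maoyan movie review crawler; just provide the Maoyan movie ID. ☆13 · Dec 19, 2019 · Updated 6 years ago
- imooc course notes: Building Scalable RESTful APIs with Python Flask ☆13 · Jun 16, 2018 · Updated 7 years ago
- Function: given a new piece of input text, compares it against existing data by similarity and returns the top 10 texts. Main methods: jieba Chinese word segmentation, gensim, TF-IDF term weighting, and cosine similarity. ☆11 · Jul 30, 2020 · Updated 5 years ago
- This project contains implementations of several common NLP algorithms: keyword extraction, named entity recognition, automatic summarization (abstract), text similarity comparison, etc. ☆16 · Jan 16, 2022 · Updated 4 years ago
- Python 3 implementations for defeating anti-crawler measures such as JS encryption, CAPTCHAs, font encryption, and other types ☆15 · Jun 9, 2023 · Updated 2 years ago
- A search engine website built on elasticsearch ☆14 · Nov 22, 2022 · Updated 3 years ago
- Scrapy framework, scraping product information (700k+ records crawled so far) ☆21 · Aug 31, 2018 · Updated 7 years ago
- [Demo] Recommends similar news headlines using TF-IDF vectorization and cosine similarity on the titles ☆14 · Mar 2, 2020 · Updated 6 years ago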
- A simhash-based text deduplication algorithm ☆20 · Jun 18, 2021 · Updated 4 years ago
- pinduoduo_spider ☆22 · Feb 28, 2019 · Updated 7 years ago
- Code for experiments on self-prediction as a way to measure introspection in LLMs ☆16 · Dec 10, 2024 · Updated last year
- ☆21 · Jan 9, 2023 · Updated 3 years ago
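- Long-text similarity model ☆21 · Nov 24, 2023 · Updated 2 years ago
- Graduation project: "Design and Implementation of Video-Text Retrieval Based on the CLIP Model" ☆18 · Jul 21, 2024 · Updated last year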
- The DSDT and SSDTs of Lenovo G470 for hackintosh. ☆12 · Dec 23, 2017 · Updated 8 years ago
- Leveraging IBM DB2’s Federation Capabilities to Perform SQL Analytics on a Sample Blockchain Insurance Application using Hyperledger Fabr… ☆12 · Sep 17, 2025 · Updated 6 months ago
- Convert blog posts to md format and save them locally ☆24 · Dec 28, 2020 · Updated 5 years ago
- Spark Streaming + kafka + hbase ☆15 · Nov 19, 2018 · Updated 7 years ago
- A list of interesting payloads, tips and tricks for bug bounty hunters. ☆24 · Sep 1, 2019 · Updated 6 years ago
- HanLP: Han Language Processing, Java version ☆30 · Oct 13, 2020 · Updated 5 years ago
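- A hands-on, step-by-step introduction to ShardingSphere ☆15 · Nov 13, 2020 · Updated 5 years ago
- A hotel management system built with the SSM framework and the Layuimini front-end template ☆21 · May 10, 2021 · Updated 4 years ago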
- Source code for our AAAI 2020 paper P-SIF: Document Embeddings using Partition Averaging ☆35 · May 2, 2020 · Updated 5 years ago
- Frp OpenWrt plugin with support for multiple servers ☆20 · Mar 6, 2024 · Updated 2 years ago
- Text similarity computation with TF-IDF + Word2vec, preferably for long texts ☆24 · Dec 18, 2019 · Updated 6 years ago
- Generalizable Implicit Hate Speech Detection using Contrastive Learning (COLING 2022) ☆14 · Oct 9, 2022 · Updated 3 years ago
- A new Chinese Weibo comments dataset, collected from Sina Weibo comments specifically for cyberbullying detection ☆15 · Aug 29, 2019 · Updated 6 years ago
- HTML5 rich text editor. Try the demo integration at ☆20 · Jun 19, 2019 · Updated 6 years ago
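- CAPTCHA sample data generator for deep-learning image recognition projects ☆34 · May 22, 2018 · Updated 7 years ago
- Study notes on big data components, including dataflow, spring cloud stream, elasticsearch, flink, spark, kafka, phoenix, Hive, and Hbase ☆22 · Jul 1, 2022 · Updated 3 years ago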
- [CVPR 2023] VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval ☆38 · Feb 28, 2023 · Updated 3 years ago
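- A year-end 2019 summary of this year's reverse-engineering work, tidying up the code and reviewing the approaches: reverse analysis of the anti_content parameter on the Pinduoduo web client; reverse analysis of the Taobao web sign; reverse analysis of Nubia cookie generation; reverse analysis of Baidu Index data encryption; reverse analysis of the _signature, as, and cp parameters on the Toutiao web client; Zhihu login formd… ☆47 · Dec 30, 2019 · Updated 6 years ago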
- ☆32 · Sep 13, 2022 · Updated 3 years ago