Neo-Luo/scrapy_baidu

Baidu web search crawler (scrapes search result list pages and detail pages, and extracts the main text from detail pages)

☆24

Alternatives and similar repositories for scrapy_baidu

Users who are interested in scrapy_baidu are comparing it to the repositories listed below.

lan2720 / deadurl_detector
Detects dead links on a website using Python
☆11 · Sep 2, 2015 · Updated 10 years ago
url2io / url2io-python-sdk
⛔ [DEPRECATED] URL2io Python SDK for web page information extraction, such as main-text extraction
☆41 · Dec 5, 2020 · Updated 5 years ago
monkey-wenjun / get_domain_info
Tool for bulk lookup of ICP filing records and domain name resolution
☆14 · Aug 29, 2018 · Updated 7 years ago
realzhengyiming / Spider_of_keywordRank
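Search engine keyword ranking crawlers for Baidu, Sogou, and 360. Keywords are taken from Baidu hot search terms, and rankings are scraped from each of the three search engines.
☆18 · Oct 10, 2019 · Updated 6 years ago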
ztg1 / huobiapia
Huobi WebSocket API
☆12 · May 3, 2019 · Updated 7 years ago
chapzq77 / PersonSpiderEngine
Multithreaded crawler for search results from Baidu, Sogou, Bing, and similar search engines; results are stored in the lightweight SQLite database
☆12 · Jul 21, 2017 · Updated 9 years ago
JaesonCheng / register_domain
Bulk scan to check whether domain names are registered
☆18 · Aug 4, 2017 · Updated 8 years ago
THU-KEG / Xlore2.0
Xlore2.0 Code[BaiduExtractor, HudongExtractor, WikiExtractor, XloreData, XloreWeb]
☆12 · Apr 5, 2017 · Updated 9 years ago
roberchenc / flashsale_python
Flash-sale sniping for Taobao, Tmall, and Xiaomi Youpin
☆13 · Feb 14, 2020 · Updated 6 years ago
chmod740 / BaiduBaikeSpider
Java source code for a multithreaded Baidu Baike crawler; data is stored in Oracle 11g
☆13 · Feb 23, 2017 · Updated 9 years ago
1543889217 / nike_spider
Product information crawlers for major e-commerce platforms
☆26 · May 11, 2020 · Updated 6 years ago
jbothma / text2onto
☆15 · Mar 18, 2012 · Updated 14 years ago
stuckyb / ontopilot
☆16 · Jul 10, 2019 · Updated 7 years ago
JokeNeverSoke / BVG
Fully automated generator of marketing-account-style videos
☆11 · Apr 22, 2021 · Updated 5 years ago
dotku / shopnc_cnnewyork
Built on the 好商城 (201511) version shared by 运维舫
☆14 · Dec 23, 2015 · Updated 10 years ago
majiga / text2kg-uwa
The University of Western Australia's submission to the ICDM 2019 Knowledge Graph Contest.
☆12 · Dec 8, 2022 · Updated 3 years ago
fancaixia / DateTimePicker
Date picker for WeChat Mini Programs
☆23 · Oct 13, 2019 · Updated 6 years ago
cafedeflore / mini_spider
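During research work, one often needs to do targeted crawling of certain websites. Because Python comes with a range of powerful libraries, targeted crawling is fairly easy in Python. Develop a mini targeted crawler, mini_spider.py, in Python that crawls breadth-first from seed links and saves to disk the pages whose URLs match a specific pattern.
☆19 · Jun 24, 2015 · Updated 11 years ago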
xiaohanxxx / Quick-ranking
Fast website ranking
☆11 · May 16, 2019 · Updated 7 years ago
MedusaSorcerer / M_downlink
☆13 · Aug 22, 2020 · Updated 5 years ago
nju-websoft / OKELE
Open Knowledge Enrichment for Long-tail Entities, WWW 2020
☆14 · Jun 17, 2022 · Updated 4 years ago
cczhr / ADBConnection
Software for connecting to Android devices remotely via adb
☆13 · Jul 26, 2020 · Updated 5 years ago
Tr3jer / AutoHookSpider
Checks whether automated crawler results belong to hooks, and keeps crawling URLs continuously.
☆30 · Jun 2, 2017 · Updated 9 years ago
RainmanJin / HTMLContentExtractor
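Extracts web page main text and in-text images, based on the Harbin Institute of Technology algorithm from "General Web Page Main-Text Extraction Based on the Line-Block Distribution Function" (《基于行块分布函数的通用网页正文抽取》)
☆11 · Jan 22, 2016 · Updated 10 years ago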
yihuitang / StyleTTS_Mandarin
Implementation of StyleTTS for Mandarin
☆11 · Jun 22, 2023 · Updated 3 years ago
liuhuang31 / g2pw_once
G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…
☆14 · Dec 30, 2023 · Updated 2 years ago
starFalll / Spider
Sina Weibo spider and Baidu search results crawler
☆195 · Jul 17, 2023 · Updated 3 years ago
tentenco / baidu-seo
Baidu fast ranking - Baidu SEO
☆23 · May 3, 2021 · Updated 5 years ago
iotjin / jh-uniapp-demo
uniapp project - implements some common effects and wraps general-purpose components and utility classes
☆16 · Mar 7, 2022 · Updated 4 years ago
django-oscar / django-oscar-promotions
☆19 · May 1, 2023 · Updated 3 years ago
AlanConstantine / SomeUsefulPyFile
Some very useful python code files.
☆18 · Aug 20, 2017 · Updated 8 years ago
crossin / games100
100 game demos by Crossin的编程教室 (Crossin's Programming Classroom)
☆15 · Jun 4, 2025 · Updated last year
SiriusZHT / Chrome-Extension-NeverMind
This Chrome extension lets you skip CAPTCHAs and log in with automatically filled user information. What are you waiting for? Come try it!
☆12 · Feb 17, 2022 · Updated 4 years ago
pyygithub / pyy
A permission management system and rapid code-generation platform with a decoupled front end and back end, built with Spring Boot, Spring Cloud Alibaba, Vue.js, and Element UI.
☆14 · Mar 3, 2023 · Updated 3 years ago
hoangnguyen1247 / multer-minio-storage
Multer storage engine for MinIO
☆12 · Apr 18, 2023 · Updated 3 years ago
jeffhj / domain-relevance
The implementation for "Measuring Fine-Grained Domain Relevance of Terms: A Hierarchical Core-Fringe Approach" (ACL '21)
☆16 · Jun 13, 2021 · Updated 5 years ago
zikwall / m3u-content-parser
PHP parser for m3u content
☆12 · Apr 6, 2022 · Updated 4 years ago
4DBA / GroupSendSMS
Bulk SMS sending from Android phones via adb
☆11 · Oct 14, 2020 · Updated 5 years ago
rio-2607 / baidu_spider
A simple crawler, written with BeautifulSoup, that scrapes Baidu search results
☆20 · Jul 29, 2015 · Updated 10 years ago