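This crawler scrapes question and answer fields from the Zhihu website. For each question it collects the title, content, publication time, topics, answer count, comment count, view count, and follower count; for each answer to that question it collects the content, author, upvote count, comment count, and answer time. The data can be used to analyze social topics and trending issues.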
☆43 · Nov 30, 2018 · Updated 7 years ago
Alternatives and similar repositories for zhihuSpider
Users that are interested in zhihuSpider are comparing it to the libraries listed below.
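- Zhihu crawler that can scrape all answers under a given question, a single answer, all answers and articles by a given user, topic highlights, collections, columns, and articles ☆77 · Sep 27, 2019 · Updated 6 years ago
- Zhihu crawler for scraping questions and their answers ☆27 · Jan 31, 2023 · Updated 3 years ago
- Uses a crawler to fetch the full text of all WeChat official account articles under a given keyword, then applies a simple duplicate-detection algorithm to keep only the non-duplicate articles, reducing manual screening work. ☆11 · Feb 20, 2021 · Updated 5 years ago
- GOAT (山羊, "goat") is a Chinese-English large language model built by supervised fine-tuning (SFT) on LlaMa. ☆12 · Apr 24, 2023 · Updated 3 years ago
- Text sentiment analysis based on Bilibili user comments ☆14 · Sep 2, 2023 · Updated 2 years ago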
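- Douyin crawler: collects account home pages, likes, favorites, original sounds, search results, following, followers, collections, and individual posts. Supports looking up account information by Douyin ID (exact follower count). Supports setting up an API. API version: `post` branch ☆24 · Jul 28, 2023 · Updated 2 years ago
- Zhihu crawler: Zhihu questions and answers with more than 1,000 upvotes, plus legendary Zhihu replies ☆23 · May 10, 2016 · Updated 10 years ago
- A simple crawler that scrapes Tao Girl (淘女郎) images, accompanying the blog post [Python crawler beginner tutorial (3): Tao Girl crawler (API analysis | image download)](https://blog.csdn.net/aaronjny/article/details/80291997). ☆11 · May 13, 2018 · Updated 8 years ago
- 1. huaproject is a bit of a bonus: it scrapes the China campus belle site (中国校花网) and saves the results locally, covering basics such as URLs, JSON, and file reading and writing. 2. Document.doc is my own summary of common crawler interview questions and answers, but I don't seem to want to work on crawlers full time, so this part may not be updated in the future; crawling is for fun, and the focus will probably shift to web … ☆14 · Jan 24, 2018 · Updated 8 years ago
- Feature: given a new piece of text, compare its similarity against existing data and return the top 10 texts. Main methods: jieba Chinese word segmentation, gensim, TF-IDF term weighting, cosine similarity. ☆11 · Jul 30, 2020 · Updated 5 years ago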
- Scrapes TFRs from FAA site ☆21 · Oct 2, 2024 · Updated last year
- 2020 Alibaba Cloud Tianchi Big Data Competition: Traditional Chinese Medicine Literature Question Generation Challenge ☆30 · Sep 2, 2021 · Updated 4 years ago
- flightradar24 GUI client built with Python ☆16 · Nov 17, 2018 · Updated 7 years ago
- Sentence embedding using Smooth Inverse Frequency weighting scheme ☆15 · Feb 21, 2020 · Updated 6 years ago
- Distributed movie search built with Scrapy + Elasticsearch + Django ☆31 · Jul 25, 2018 · Updated 7 years ago
- Scrapes pins (想法), articles, and answers from Zhihu personal profile pages ☆73 · May 1, 2025 · Updated last year
- WSDM 2022: Knowledge Enhanced Sports Game Summarization ☆17 · Jun 16, 2022 · Updated 4 years ago
- Zhihu simulated login, with support for captcha extraction and saving cookies ☆357 · Jul 27, 2022 · Updated 3 years ago
- Node.js app to watch files and directories then sync them to the remote server using rsync ☆22 · Apr 8, 2026 · Updated 2 months ago
- ☆19 · Jan 7, 2018 · Updated 8 years ago
- My implementation for Berkeley AI Pacman projects No. 1 and No. 2 ☆15 · Oct 28, 2019 · Updated 6 years ago
- Edge computing service deployment and experiments ☆14 · Aug 19, 2019 · Updated 6 years ago
- Self-written crawler for scraping Zhihu questions and answers ☆39 · Jun 10, 2019 · Updated 7 years ago
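- Unsupervised text segmentation based on Latent Dirichlet Allocation and Topic Tiling ☆24 · Aug 6, 2016 · Updated 9 years ago
- A list of ethics related resources for researchers and practitioners of Natural Language Processing and Computational Linguistics ☆34 · Oct 20, 2025 · Updated 7 months ago
- Chinese text sentiment classification based on the BERT model ☆40 · Oct 29, 2022 · Updated 3 years ago
- Crawler for scraping CSDN blogs ☆25 · Oct 14, 2019 · Updated 6 years ago
- Uses GloVe embeddings and greedy sequence segmentation to semantically segment a text document into any number of k segments. ☆33 · Feb 17, 2019 · Updated 7 years ago
- Docker container for ADS-B ☆30 · Sep 2, 2020 · Updated 5 years ago
- "谛听" (discern) asset identification and analysis platform: a simplified security search engine for IoT devices and an iterated, optimized version of IOT-Scanner. It currently integrates host discovery, port scanning, device identification, vulnerability matching, and PoC verification. ☆18 · Feb 6, 2021 · Updated 5 years ago
- Cookie-free Weibo crawler that can continuously scrape one or more Sina Weibo users' profile information, their posts, and the comments and reposts on those posts. ☆166 · Apr 8, 2022 · Updated 4 years ago
- Open-source QG (Question Generation) system, written with PyTorch and Transformer ☆55 · Jul 25, 2024 · Updated last year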
- Zhihu big data analysis and trending-topic generation. ☆331 · Mar 21, 2017 · Updated 9 years ago
- Implementation of the work presented in the AAAI 2019 paper: "Predicting Hurricane Trajectories Using a Recurrent Neural Network" ☆23 · Jan 4, 2022 · Updated 4 years ago
- Large-scale crawler based on Celery ☆10 · Feb 16, 2020 · Updated 6 years ago
- Book of Life and Death (生死簿) management system ☆11 · Jun 23, 2019 · Updated 6 years ago
- Ansible API built with Flask and Celery ☆12 · Aug 16, 2016 · Updated 9 years ago
- [ACL 2025] FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation ☆67 · Jun 19, 2025 · Updated last year
- Data analysis of crawled Zhihu user data ☆15 · Nov 12, 2017 · Updated 8 years ago