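This crawler scrapes field data for questions and answers on Zhihu: for each question, the title, content, publication time, topics, answer count, comment count, click count, follower count, and so on; and for each answer to that question, the content, author, upvote count, comment count, answer time, and so on. It can be used for data analysis of social topics and trending issues.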
☆43 · Nov 30, 2018 · Updated 7 years ago
Alternatives and similar repositories for zhihuSpider
Users that are interested in zhihuSpider are comparing it to the libraries listed below.
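- Zhihu crawler that can scrape all answers or a single answer under a specific question, all answers and articles by a specific user, top topic content, collections, columns, and articles ☆76 · Sep 27, 2019 · Updated 6 years ago
- Zhihu crawler for scraping questions and their corresponding answers ☆28 · Jan 31, 2023 · Updated 3 years ago
- Uses a crawler to fetch the full text of all WeChat official account articles under a given keyword, then applies a simple duplicate-detection algorithm to pick out the non-duplicate articles, reducing manual screening effort. ☆11 · Feb 20, 2021 · Updated 5 years ago
- GOAT (山羊) is a Chinese-English large language model built by SFT on LLaMA. ☆12 · Apr 24, 2023 · Updated 2 years ago
- ☆10 · Dec 3, 2020 · Updated 5 years ago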
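- Text sentiment analysis based on Bilibili user comments ☆14 · Sep 2, 2023 · Updated 2 years ago
- Codebase for character-centric story understanding ☆14 · Jan 20, 2022 · Updated 4 years ago
- Code and data to support Bamman et al. (2020), "A Dataset of Literary Coreference" (LREC) ☆10 · Dec 8, 2022 · Updated 3 years ago
- Douyin crawler: collects account home pages, likes, favorites, original sounds, search, following, followers, collections, and single videos. Supports looking up account info by Douyin ID (exact follower count). Supports building an API. API version: post branch ☆23 · Jul 28, 2023 · Updated 2 years ago
- Encyclopedic Hub for Sentiment Dictionaries ☆15 · Nov 20, 2025 · Updated 4 months ago
- A simple crawler for scraping Tao Girl (淘女郎) images, accompanying the blog post [Python Crawler Beginner Tutorial (3): Tao Girl Crawler (API Analysis | Image Download)](https://blog.csdn.net/aaronjny/article/details/80291997). ☆11 · May 13, 2018 · Updated 7 years ago
- Tool to simplify complex and compound sentences to simple sentences implemented using Python ☆19 · Sep 9, 2020 · Updated 5 years ago
- Simple keyword-based Scrapy crawler for Xinhuanet (新华网) and People's Daily Online (人民网) ☆12 · Jun 2, 2022 · Updated 3 years ago
- Data and code for the book Enumerations: Data and Literary Study (Chicago 2018) ☆26 · Dec 2, 2018 · Updated 7 years ago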
- Feature: given a new piece of text, compare its similarity against existing data and return the top 10 texts. Main methods: jieba Chinese word segmentation, gensim, TF-IDF term weighting, and cosine similarity. ☆11 · Jul 30, 2020 · Updated 5 years ago
- This is a meta-model distilled from LLMs for information extraction. This is an intermediate checkpoint that can be well-transferred to a… ☆29 · Feb 23, 2025 · Updated last year
- Scrapes TFRs from FAA site ☆21 · Oct 2, 2024 · Updated last year
- A Semantic Role Label classifier inspired by the article "Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling… ☆23 · Feb 27, 2019 · Updated 7 years ago
- Identifies and filters sensitive information in text content using Python scripts ☆38 · Jan 12, 2016 · Updated 10 years ago
- Predict human emotions in tweets by mapping emojis into the Valence-Arousal space (Russell, 2005). LSTM models the sequence learning of w… ☆24 · Dec 8, 2022 · Updated 3 years ago
- Sharable scripts and stylesheets from the Northeastern University Women Writers Project ☆24 · Apr 7, 2026 · Updated last week
- This is AutoGenDemo ☆11 · Mar 12, 2024 · Updated 2 years ago
- flightradar24 GUI client built with Python ☆16 · Nov 17, 2018 · Updated 7 years ago
- Sentence embedding using Smooth Inverse Frequency weighting scheme ☆15 · Feb 21, 2020 · Updated 6 years ago
- Distributed movie search built on Scrapy + Elasticsearch + Django ☆31 · Jul 25, 2018 · Updated 7 years ago
- repackage of official CAJviewer ☆10 · Jan 26, 2021 · Updated 5 years ago
- Scrapes the pins (想法), articles, and answers from Zhihu personal profile pages ☆68 · May 1, 2025 · Updated 11 months ago
- WSDM 2022: Knowledge Enhanced Sports Game Summarization ☆18 · Jun 16, 2022 · Updated 3 years ago
- Extract time-domain or frequency-domain features from WAV-format audio ☆34 · Oct 3, 2019 · Updated 6 years ago
- easymqtt4j: netty + mqtt + subscriber + publisher + broker + cluster server for Java ☆12 · Apr 10, 2026 · Updated last week
- Node.js app to watch files and directories then sync them to the remote server using rsync ☆22 · Apr 8, 2026 · Updated last week
- ☆19 · Jan 7, 2018 · Updated 8 years ago
- Various Arduino projects, including light bulb control and more ☆10 · Feb 25, 2020 · Updated 6 years ago
- Implementations of various sentiment analysis methods in Python. ☆33 · Nov 10, 2017 · Updated 8 years ago
- A manufacturing management system ☆13 · Jan 25, 2019 · Updated 7 years ago
- A program to regulate pdf books library programmer qt ☆12 · Oct 12, 2020 · Updated 5 years ago
- This project is a Weibo crawler that uses a post's mid to fetch all of its associated likes, reposts, comments, and second-level comments. ☆57 · Oct 14, 2022 · Updated 3 years ago
- Subscribe to Mosquitto MQTT Broker and Publish data to MySQL Database ☆13 · Jan 28, 2020 · Updated 6 years ago
- ☆11 · Aug 10, 2017 · Updated 8 years ago