apify / actor-page-analyzer
Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSON-LD metadata, analyzes AJAX requests, etc.
☆150 Updated last year
Related projects
Alternatives and complementary repositories for actor-page-analyzer
- Rewriting web proxy and archival tool. At this point, it just tries to download all the things. ☆199 Updated this week
- 📮 Dialogflow + Sendgrid = AI Mailbox ☆35 Updated 4 years ago
- Scraping assistant tool. Editing and maintaining CSS/XPath selectors across webpages. ☆101 Updated 6 years ago
- Twitter AI Platform ☆93 Updated 7 years ago
- Remote client for distributed automated HTTP(s) content fetching. ☆77 Updated this week
- An OPML file with 22 of the top 25 US newspapers RSS feeds ☆55 Updated 6 years ago
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆187 Updated 2 years ago
- Google2Csv: a simple Google scraper that saves the results to a CSV/XLSX/JSONL file ☆169 Updated 4 years ago
- Dashboard is software for creating web apps and SaaS (support @ freenode #userdashboard) ☆280 Updated 3 years ago
- Scrapy middleware that allows crawling only new content ☆79 Updated 2 years ago
- Extract and decompose URLs (including emails, which are conceptually a part of URLs) with robust patterns. ☆341 Updated 3 months ago
- A web app to create and browse text visualizations for automated customer listening. ☆148 Updated last year
- JavaScript Library for Google Sheets/Microsoft Excel Online through sheet2api. https://sheet2api.com/ ☆91 Updated last year
- File-system-based database (in the git repo), with a server attached with users and access control for serving this data. See an example … ☆64 Updated last year
- Kinase is a pluggable browser extension allowing you to label content on the web. ☆78 Updated 7 years ago
- Sup.js saves URL parameters and inputs them into any form submitted during a visitor's session on your site. ☆86 Updated last year
- Flask code to deploy an API that pulls structured data from online news articles ☆230 Updated last year
- Export your Hacker News saved links to JSON or CSV from the Chrome console. ☆50 Updated 8 years ago
- Aviation grade news article metadata extraction ☆36 Updated last year
- ☆40 Updated 3 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆57 Updated 9 months ago
- Extract a list of keywords from a website, sorted by word count. ☆51 Updated 8 years ago
- An algorithm for generating robust XPath locators for web testing. ☆178 Updated last year
- Track clicks and other client-side events on web pages ☆225 Updated 7 years ago