apify / actor-page-analyzer
Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSON-LD metadata, analyzes AJAX requests, etc.
☆150 Updated 2 years ago
Alternatives and similar repositories for actor-page-analyzer:
Users who are interested in actor-page-analyzer are comparing it to the repositories listed below.
- JavaScript library for Google Sheets/Microsoft Excel Online through sheet2api. https://sheet2api.com/ ☆92 Updated 2 years ago
- File-system-based database (in the git repo), with a server attached with users and access control for serving this data. See an example … ☆63 Updated 2 years ago
- Rewriting web proxy and archival tool. At this point, it just tries to download all the things. ☆202 Updated last week
- Twitter AI Platform ☆93 Updated 7 years ago
- Scrapy rotation proxy package with advanced functions ☆95 Updated 2 years ago
- Scraping assistant tool for editing and maintaining CSS/XPath selectors across webpages. ☆102 Updated 6 years ago
- Kinase is a pluggable browser extension allowing you to label content on the web. ☆78 Updated 7 years ago
- midas is a framework that enables you to enrich your CSV, JSON or Excel dataset with any web API you can think of. ☆53 Updated 6 years ago
- Google2Csv, a simple Google scraper that saves the results to a CSV/XLSX/JSONL file ☆169 Updated 4 years ago
- Dashboard is software for creating web apps and SaaS (support @ freenode #userdashboard) ☆282 Updated 4 years ago
- Extract and decompose (fuzzy) URLs (including emails, which are conceptually a part of URLs) in texts with Area-Pattern-based modularity ☆352 Updated 2 months ago
- keywords-extract - Command-line tool to extract keywords from any web page. ☆63 Updated 6 years ago
- An algorithm for generating robust XPath locators for web testing. ☆183 Updated 2 years ago
- House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases. ☆121 Updated 2 years ago
- Sup.js saves URL parameters and inputs them into any form submitted during a visitor's session on your site. ☆86 Updated last year
- Wren enables users to discover and explore daily news stories 🗞️📻 📺 ☆259 Updated 6 years ago
- Track clicks and other client-side events on web pages ☆225 Updated 7 years ago
- Flask code to deploy an API that pulls structured data from online news articles ☆229 Updated 2 years ago
- HITs To Work With Mturk ☆16 Updated 8 years ago
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues. ☆429 Updated 2 years ago
- OpenFaaS template for headless Chrome and Puppeteer ☆91 Updated last year
- A repository of email marketing legislation around the world, compiled by EmailOctopus. ☆451 Updated 4 months ago
- Export your Hacker News saved links to JSON or CSV from the Chrome console. ☆51 Updated 8 years ago
- Record browser actions then replay immediately. Craft your own custom automation workflows. ☆65 Updated 5 years ago
- The Google Tag Manager Container for The Next Web (Web + AMP) ☆132 Updated 8 years ago
- Remote client for distributed automated HTTP(s) content fetching. ☆77 Updated last week
- Extract a list of keywords from a website, sorted by word count. ☆51 Updated 8 years ago
- An OPML file with 22 of the top 25 US newspapers' RSS feeds ☆55 Updated 6 years ago
- Type with your voice on Mac/Windows/Linux using Electron.js and Google Chrome ☆41 Updated 4 years ago
- A minimalist toolkit for building scalable, fault-tolerant and eventually consistent microservices ☆45 Updated last year