List of libraries, tools and APIs for web scraping and data processing.
☆13 · Sep 17, 2015 · Updated 10 years ago
Alternatives and similar repositories for awesome-web-scraping
Users that are interested in awesome-web-scraping are comparing it to the libraries listed below.
- Cursos ☆10 · Feb 27, 2019 · Updated 7 years ago
- This is an API for a todo list application implemented using API Star ☆12 · Dec 26, 2022 · Updated 3 years ago
- Vinta's ESLint and Prettier shareable configs. ☆23 · Feb 19, 2024 · Updated 2 years ago
- Docker Image with Matlab Compiler Runtime and SSHD ☆15 · Aug 28, 2014 · Updated 11 years ago
- scrapy-extras -- a collection of code samples and modules for the Scrapy framework. ☆14 · Dec 14, 2020 · Updated 5 years ago
- A Scrapy pipeline to categorize items using MonkeyLearn ☆38 · Apr 28, 2017 · Updated 9 years ago
- Shows how to encrypt data held in public space ☆11 · Aug 11, 2017 · Updated 8 years ago
- A simple and fast Oh-My-Zsh theme ☆26 · Nov 17, 2018 · Updated 7 years ago
- An awesome list of (large-scale) public datasets on the Internet. (On-going collection) ☆24 · Feb 18, 2022 · Updated 4 years ago
- litrl browser and detectors ☆10 · Oct 5, 2023 · Updated 2 years ago
- Tower Sim & Entry for 10k Apart 2016 ☆12 · Dec 3, 2019 · Updated 6 years ago
- Data and code used in Yarkoni (2019) -- "The Generalizability Crisis" ☆13 · Nov 22, 2019 · Updated 6 years ago
- Templates for academic documents in Pandoc Markdown ☆15 · Jan 31, 2019 · Updated 7 years ago
- command line dictionary written in python. ☆19 · Jun 20, 2015 · Updated 10 years ago
- A CLI for dealing with the features of ScrapingHub ☆16 · Apr 20, 2021 · Updated 5 years ago
- A decorator to write coroutine-like spider callbacks. ☆109 · Dec 26, 2022 · Updated 3 years ago
- csharp-functional provides a set of NuGet packages to drive your coding towards a functional approach as well as enabling Railway Oriente… ☆11 · Jul 12, 2022 · Updated 3 years ago
- Introductory tutorial creating a narrative to the RStudio's tutorial and other documentation for newbies to R's wonderful package shiny. ☆24 · Jan 27, 2015 · Updated 11 years ago
- Presentation for the NYU Data Lab December 2015 ☆14 · Dec 2, 2015 · Updated 10 years ago
- A JSON API to tag a sentence with part of speech tags. Uses UDPipe, so support for hundreds of languages. ☆14 · Dec 2, 2024 · Updated last year
- Schedules fetching all repos in a working folder from Git, and optionally pulls changes if there are no conflicts. ☆11 · Jan 5, 2023 · Updated 3 years ago
- A linter for Scrapy projects. ☆22 · Feb 25, 2026 · Updated 3 months ago
- Stripe Identity Verification API demo app hosted on Codesandbox ☆11 · Jan 6, 2023 · Updated 3 years ago
- ☆10 · Nov 2, 2016 · Updated 9 years ago
- Pseudo-localization tool for .NET ☆16 · Updated this week
- Joint estimation of sentiment and topics in textual data ☆14 · Aug 9, 2023 · Updated 2 years ago
- i will post updates on my instagram @unkn0wn_bali tufhub - a hacking framework with all kinds of bruteforce, info gather, dos attack,… ☆13 · Nov 28, 2018 · Updated 7 years ago
- The simMixedDAG package enables simulation of "real life" datasets from DAGs ☆13 · Oct 12, 2019 · Updated 6 years ago
- 🛠 Useful R functions for various things ☆18 · Jul 4, 2019 · Updated 6 years ago
- This repository has been transferred to jeroendmulder.github.io/RI-CLPM for easier maintenance. The Github Pages automatically redirects … ☆13 · Jul 20, 2022 · Updated 3 years ago
- PubPeer Chrome browser extension ☆14 · Feb 18, 2025 · Updated last year
- A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data. ☆15 · Feb 28, 2025 · Updated last year
- Python implementation of the Parsley language for extracting structured data from web pages ☆92 · Oct 26, 2017 · Updated 8 years ago
- Tool to flatten stream of JSON-like objects, configured via schema ☆33 · Oct 19, 2019 · Updated 6 years ago
- mutation testing for R ☆16 · Nov 11, 2024 · Updated last year
- Classify Twitter accounts as institutional or ordinary users. ☆12 · Nov 16, 2018 · Updated 7 years ago
- A GitHub Action that lints Python code with Flake8 then automatically creates pull request reviews if there are any violations. ☆27 · Apr 20, 2022 · Updated 4 years ago
- An R package to gather, munge, and convert event datasets into temporal event-networks. ☆11 · Mar 28, 2018 · Updated 8 years ago
- Swedish data ☆14 · May 6, 2026 · Updated 3 weeks ago