scrapinghub/shublang

Pluggable DSL that uses pipes to perform a series of linear transformations to extract data

☆16

Alternatives and similar repositories for shublang

Users who are interested in shublang are comparing it to the libraries listed below.

andresionek91 / data-scientist-value
Flask app to calculate compensation of a data scientist
☆12 · Dec 27, 2022 · Updated 3 years ago
scrapinghub / scrapy-poet
Page Object pattern for Scrapy
☆127 · Jun 8, 2026 · Updated last month
scrapy-plugins / scrapy-zyte-api
Zyte API integration for Scrapy
☆43 · Updated this week
datasprints / dremio-sdk-js
Dremio SDK for JavaScript
☆26 · Jun 1, 2020 · Updated 6 years ago
andresionek91 / Job-Listing-Scraper
Scrapes job listings from Glassdoor
☆33 · Nov 21, 2019 · Updated 6 years ago
scrapinghub / scrapy-autoextract
Zyte Automatic Extraction integration for Scrapy
☆58 · Apr 13, 2026 · Updated 3 months ago
scrapinghub / web-poet
Web scraping Page Objects core library
☆107 · Jul 10, 2026 · Updated 2 weeks ago
olist / work-at-olist-data
Apply for a job at Olist's Data Team: https://olist.gupy.io/
☆56 · Mar 4, 2022 · Updated 4 years ago
zytedata / python-zyte-api
Python client for Zyte API
☆30 · Updated this week
scrapinghub / shub-workflow
☆14 · Jul 16, 2026 · Updated last week
Tiendil / smart-imports
Smart imports for Python
☆38 · Nov 8, 2021 · Updated 4 years ago
inferlink / landmark-extractor
☆11 · May 31, 2019 · Updated 7 years ago
unipampa-lesse / thoth-legacy
Thoth: systematic review tool
☆12 · May 8, 2025 · Updated last year
ppgcc / GenerateReportBib
Script that checks whether .bib files follow the reference rules of the PPGCC/PUCRS LaTeX template.
☆17 · Oct 18, 2021 · Updated 4 years ago
Nykakin / chompjs
Parsing JavaScript objects into Python data structures
☆222 · May 17, 2026 · Updated 2 months ago
scrapinghub / autologin
A project to attempt to automatically log in to a website given a single seed
☆11 · Jun 17, 2024 · Updated 2 years ago
wemake-services / wemake-django-rest
Create Django REST APIs the right way, no magic intended
☆11 · Dec 8, 2022 · Updated 3 years ago
CodingCrush / AioCrawler
Async crawler framework based on aiohttp and asyncio for running fast.
☆12 · Sep 3, 2017 · Updated 8 years ago
zahlabut / LogTool
OpenStack logs: export errors and other useful modes
☆14 · Jun 3, 2026 · Updated last month
solidOptionOS / solid-scripts
A collection of scripts for Linux and Unix
☆12 · Aug 17, 2020 · Updated 5 years ago
flatsiedatsie / webthings-network-presence-detection
Devices on the local network can be added as a 'thing' in the Candle Controller / WebThings Gateway. Automations can then respond to thei…
☆12 · Feb 9, 2026 · Updated 5 months ago
asajid03 / Lets-Defend-Solutions
The "Let's-defend-solution" directory contains the answers to all paths of the Let's Defend platform that were saved by the creator 8 mon…
☆13 · Apr 27, 2023 · Updated 3 years ago
zsen-njsmi / Khed
Khed is an easy-to-use, free anime downloader, supporting episode playlists and resumable downloads.
☆14 · May 12, 2021 · Updated 5 years ago
apify / actor-scrapy-executor
Apify actor to run web spiders written in Python in the Scrapy library
☆13 · Dec 11, 2022 · Updated 3 years ago
goutomroy / digging_asyncio
Python 3.7 asyncio tutorial.
☆14 · Aug 24, 2019 · Updated 6 years ago
gunesmes / python-selenium-behave-page-object-docker
Run your Selenium BDD (Behaviour Driven Development) test cases in Docker. Python, Selenium, Behave, Chrome, Docker. Page object mode (PO…
☆12 · Nov 5, 2024 · Updated last year
weedle1912 / detection-and-tracking
Autonomous object tracking: A combination of a detector and a tracker
☆14 · May 31, 2018 · Updated 8 years ago
Ivana- / Liscript-Python
Liscript command-line REPL in Python
☆13 · Jul 14, 2019 · Updated 7 years ago
AILab-FOI / B.A.R.I.C.A.
B.A.R.I.C.A. is an acronym standing for "Beautiful ARtificial Intelligence Cognitive Agent"
☆11 · Apr 13, 2026 · Updated 3 months ago
Sl0v3C / PriceSpider
Price Spider is a Python tool to get price & promotion from JD, Tmall, Amazon, BeiBei
☆10 · Jun 14, 2019 · Updated 7 years ago
Tyrone-Zhao / crawlerUtils
Utils for programming web crawlers
☆11 · May 16, 2019 · Updated 7 years ago
Anakeyn / Bert_Squad_SEO
This tool provides a "Bert Score" for up to the first 30 pages responding to a question in Google
☆13 · Feb 10, 2020 · Updated 6 years ago
logicalhacking / ExtensionCrawler
A collection of utilities for downloading and analyzing browser extensions from the Chrome Web Store.
☆20 · Oct 10, 2023 · Updated 2 years ago
bnortman / useful-utilities
A set of useful tools for development. Enabling you to pull down multiple organizational source code repositories by generating clone and…
☆18 · May 6, 2019 · Updated 7 years ago
kaoticrequiem / fiercecroissant
A Pastebin scraper designed to look for malicious content
☆20 · Nov 20, 2019 · Updated 6 years ago
sgrieve / ScholarDOI
A Chrome extension which adds DOI support to Google Scholar
☆29 · Mar 2, 2020 · Updated 6 years ago
scrapinghub / spidermon
Scrapy extension for monitoring spider execution.
☆561 · May 28, 2026 · Updated last month
mathculthello / math.ru
math.ru website
☆14 · Jan 7, 2023 · Updated 3 years ago
rafatbiin / newspaper-crawler
Scrapy-based crawler which crawls newspapers.
☆20 · Mar 21, 2026 · Updated 4 months ago