michaelhelmick/lassie

Web Content Retrieval for Humans™

☆629

Alternatives and similar repositories for lassie

Users who are interested in lassie are comparing it to the libraries listed below.

coleifer / micawber
a small library for extracting rich content from urls
☆681 · Updated this week
tomekwojcik / envelopes
Mailing for human beings
☆582 · Mar 7, 2019 · Updated 7 years ago
vinta / haul
An Extensible Image Crawler
☆161 · Jan 7, 2017 · Updated 9 years ago
matiasb / demiurge
PyQuery-based scraping micro-framework.
☆118 · Jan 14, 2022 · Updated 4 years ago
deanmalmgren / textract
extract text from any document. no muss. no fuss.
☆4,669 · Jul 11, 2026 · Updated last week
Alir3z4 / python-sanitize
Bringing sanity to world of messed-up data
☆66 · Oct 7, 2014 · Updated 11 years ago
scrapy / scrapely
A pure-python HTML screen-scraping library
☆1,884 · Apr 4, 2022 · Updated 4 years ago
martinrusev / imbox
Python IMAP for Agentic Workflows
☆1,218 · Jun 23, 2026 · Updated 3 weeks ago
Zulko / picnic.py
Easy Python packages creation.
☆249 · Feb 10, 2020 · Updated 6 years ago
lorien / grab
Web Scraping Framework
☆2,461 · Sep 19, 2025 · Updated 10 months ago
miso-belica / sumy
Module for automatic summarization of text documents and HTML pages.
☆3,695 · Updated this week
andychase / pipeless
Simple pipeline building framework
☆129 · Apr 20, 2016 · Updated 10 years ago
elliotgao2 / toapi
Every web site provides APIs.
☆3,555 · Jul 10, 2026 · Updated last week
mailgun / flanker
Python email address and Mime parsing library
☆1,651 · Apr 8, 2026 · Updated 3 months ago
wuub / requirementstxt
Sublime plugin for requirements.txt
☆116 · Feb 13, 2017 · Updated 9 years ago
schematics / schematics
Python Data Structures for Humans™.
☆2,590 · Jul 12, 2023 · Updated 3 years ago
ecdavis / pants
A lightweight framework for writing asynchronous network applications in Python.
☆163 · Jul 8, 2017 · Updated 9 years ago
kennethreitz-archive / procs
Python, Processes, and Prana.
☆226 · Mar 10, 2015 · Updated 11 years ago
pydanny / dj-libcloud
Adds easy python 3 and 2.7 support to Django for management of static assets.
☆53 · Jan 6, 2017 · Updated 9 years ago
ssteuteville / scrapyz
"Scrape Easy" - an extension of the Scrapy framework.
☆185 · Aug 13, 2016 · Updated 9 years ago
MechanicalSoup / MechanicalSoup
A Python library for automating interaction with websites.
☆4,876 · Jun 26, 2026 · Updated 3 weeks ago
waawal / undead
Dead Easy POSIX Daemons for Python (POC)
☆171 · Jul 30, 2013 · Updated 12 years ago
jeffknupp / sandman
Sandman "makes things REST".
☆2,288 · Dec 25, 2021 · Updated 4 years ago
Ceasar / twosheds
Python library for making POSIX shells
☆136 · Apr 20, 2021 · Updated 5 years ago
mahmoud / boltons
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library.…
☆6,906 · Updated this week
jaimegildesagredo / booby
Data modeling and validation Python library
☆175 · Sep 21, 2021 · Updated 4 years ago
jomido / jogger
Navigate log files.
☆98 · Jan 30, 2014 · Updated 12 years ago
Alir3z4 / html2text
Convert HTML to Markdown-formatted text.
☆2,169 · Oct 28, 2025 · Updated 8 months ago
gabrielfalcao / HTTPretty
Intercept HTTP requests at the Python socket level. Fakes the whole socket module
☆2,162 · Jun 9, 2024 · Updated 2 years ago
stephenmcd / hot-redis
Rich Python data types for Redis
☆295 · Apr 3, 2024 · Updated 2 years ago
charlierguo / gmail
A Pythonic interface for Google Mail
☆1,800 · Jul 9, 2023 · Updated 3 years ago
gruns / furl
🌐 The easiest way to parse and modify URLs in Python.
☆2,809 · Feb 22, 2026 · Updated 4 months ago
codelucas / newspaper
newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs:
☆15,114 · Jul 8, 2026 · Updated last week
tellapart / commandr
Tool that creates command line interfaces for functions automatically.
☆203 · Nov 27, 2017 · Updated 8 years ago
isnowfy / pydown
An HTML5 presentation builder written by python
☆691 · Apr 17, 2017 · Updated 9 years ago
ghickman / classify
☆34 · Jul 14, 2026 · Updated last week
paylogic / pip-accel
pip-accel: Accelerator for pip, the Python package manager
☆306 · May 26, 2020 · Updated 6 years ago
yhat / db.py
db.py is an easier way to interact with your databases
☆1,217 · Aug 2, 2021 · Updated 4 years ago
derek73 / python-nameparser
A simple Python module for parsing human names into their individual components
☆711 · Updated this week