edgi-govdata-archiving / wayback
A Python API to the Internet Archive Wayback Machine
☆78 Updated 3 weeks ago
Alternatives and similar repositories for wayback
Users who are interested in wayback are comparing it to the libraries listed below.
- Wayback Machine API interface & a command-line tool ☆544 Updated last year
- Now included in rigour ☆151 Updated 3 weeks ago
- A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service ☆182 Updated 10 months ago
- Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city. ☆130 Updated last month
- A helper library full of URL-related heuristics. ☆70 Updated this week
- Alternative robots parser module for Python ☆18 Updated this week
- A maximum-strength name parser for record linkage. ☆38 Updated this week
- Python based Wikidata framework for easy dataframe extraction ☆45 Updated last year
- A modern Python library for writing maintainable web scrapers. ☆249 Updated 2 months ago
- Guess gender from first name in Python 2 and 3 ☆137 Updated 3 months ago
- A light-weight wrapper for the Datawrapper API. ☆65 Updated last year
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated 10 months ago
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 Updated 2 years ago
- A framework for quick web archiving; canonical repository: https://gitea.arpa.li/JustAnotherArchivist/qwarc ☆29 Updated 4 years ago
- ⛏ a library for scraping unreliable pages ☆213 Updated 2 weeks ago
- A webmining CLI tool & library for Python. ☆334 Updated this week
- A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data. ☆14 Updated 6 months ago
- A Python implementation of Lunr.js 🌖 ☆199 Updated 5 months ago
- Utility library to turn country names into ISO two-letter codes ☆70 Updated last month
- Tag news stories based on models trained on the NYT corpus. ☆42 Updated 2 years ago
- Django Legal Advice Builder is a Django app that can be used to create, edit and display multi-step questionnaires and display the answers… ☆12 Updated 2 years ago
- Parse government documents into well formed JSON ☆72 Updated 3 weeks ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆153 Updated last month
- A Python library for defining rule-based overrides on messy data ☆16 Updated 3 weeks ago
- The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.… ☆25 Updated 4 years ago
- Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex. ☆62 Updated this week
- Examples for getting started using https://case.law ☆67 Updated 2 years ago
- The country converter (coco) - a Python package for converting country names between different classification schemes. ☆242 Updated last month
- An open-source archive that gathers, saves, shares and analyzes news homepages ☆144 Updated this week
- Libzim binding for Python: read/write ZIM files in Python ☆92 Updated 4 months ago