edgi-govdata-archiving / wayback
A Python API to the Internet Archive Wayback Machine
☆76 Updated last year
Alternatives and similar repositories for wayback
Users who are interested in wayback are comparing it to the libraries listed below.
- Wayback Machine API interface & a command-line tool ☆543 Updated last year
- A helper library full of URL-related heuristics. ☆70 Updated 2 months ago
- Guess gender from first name in Python 2 and 3 ☆137 Updated 2 months ago
- Alternative robots parser module for Python ☆18 Updated last month
- A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service ☆180 Updated 10 months ago
- Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city. ☆131 Updated 2 weeks ago
- A webmining CLI tool & library for python. ☆333 Updated 2 months ago
- A set of utilities for processing MediaWiki XML dump data. ☆57 Updated 6 months ago
- A maximum-strength name parser for record linkage. ☆38 Updated 2 months ago
- Public API client for GETTR, a "non-bias [sic] social network," designed for data archival and analysis. ☆93 Updated last month
- Python wrapper for the MediaWiki API to access and parse data from Wikipedia ☆41 Updated this week
- Now included in rigour ☆151 Updated 2 weeks ago
- Parse government documents into well formed JSON ☆72 Updated this week
- Fast and robust date extraction from web pages, with Python or on the command-line ☆136 Updated 2 weeks ago
- A repo to collect issues with calmcode.io ☆16 Updated 5 years ago
- Taupe takes a downloaded Twitter archive ZIP file, extracts the URLs corresponding to tweets, retweets, replies, quote tweets, and liked … ☆33 Updated 2 years ago
- Effortless conversion between data formats like JSON, XML and CSV ☆120 Updated 3 years ago
- Examples for getting started using https://case.law ☆66 Updated 2 years ago
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 Updated 2 years ago
- Python based Wikidata framework for easy dataframe extraction ☆45 Updated last year
- The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.… ☆25 Updated 4 years ago
- Web scraping Page Objects core library ☆101 Updated this week
- A deep learning model for extracting references from text ☆29 Updated last year
- Pythonic wrapper for the Google Sheets API ☆124 Updated 2 months ago
- The country converter (coco) - a Python package for converting country names between different classification schemes. ☆240 Updated 3 weeks ago
- Datasette plugin providing instructions for exporting data to Jupyter or Observable ☆13 Updated last year
- Sidewall is a Python library for interacting with the Dimensions search API. ☆17 Updated 11 months ago
- A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches. ☆220 Updated 2 years ago
- A database of court reporters, tests and other experiments ☆113 Updated 2 weeks ago
- A Python client for collecting and parsing public data from the YouTube Data API ☆81 Updated 2 years ago