edgi-govdata-archiving / wayback
A Python API to the Internet Archive Wayback Machine
☆80 Updated 2 weeks ago
Alternatives and similar repositories for wayback
Users who are interested in wayback are comparing it to the libraries listed below.
- A helper library full of URL-related heuristics. ☆73 Updated last month
- A maximum-strength name parser for record linkage. ☆39 Updated 2 months ago
- Extract place names from a URL or text, and add context to those names, for example distinguishing between a country, region or city. ☆129 Updated 3 weeks ago
- Now included in rigour ☆153 Updated 2 months ago
- 🌬️urlExpander is a Python package for expanding shortened links (URLs). ☆76 Updated 3 years ago
- Alternative robots parser module for Python ☆20 Updated this week
- Python wrapper for the MediaWiki API to access and parse data from Wikipedia ☆42 Updated 2 months ago
- A webmining CLI tool & library for Python. ☆340 Updated last week
- Guess gender from first name in Python 2 and 3 ☆138 Updated 5 months ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆142 Updated last week
- The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.… ☆24 Updated 5 years ago
- Python-based Wikidata framework for easy dataframe extraction ☆45 Updated last year
- A framework for quick web archiving; canonical repository: https://gitea.arpa.li/JustAnotherArchivist/qwarc ☆30 Updated 4 years ago
- A repo to collect issues with calmcode.io ☆16 Updated 5 years ago
- Python client for the Center for Responsive Politics API at OpenSecrets.org. ☆42 Updated 5 years ago
- Public API client for GETTR, a "non-bias [sic] social network," designed for data archival and analysis. ☆96 Updated 4 months ago
- A Python library for defining rule-based overrides on messy data ☆16 Updated 2 months ago
- Collection of common Python utility functions and classes used in other Caltech Library programs. ☆19 Updated 2 years ago
- A deep learning model for extracting references from text ☆29 Updated 2 years ago
- Some tools to help analyze the Twitter archive ☆64 Updated 5 months ago
- Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex. ☆62 Updated this week
- A modern Python library for writing maintainable web scrapers. ☆247 Updated 2 weeks ago
- A Python client for collecting and parsing public data from the YouTube Data API ☆81 Updated 2 years ago
- A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data. ☆14 Updated 8 months ago
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated last year
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 Updated 2 years ago
- Estimating the age of web resources ☆96 Updated 5 months ago
- Examples for getting started using https://case.law ☆69 Updated 3 years ago
- The country converter (coco) - a Python package for converting country names between different classification schemes. ☆249 Updated 2 weeks ago
- Write Datasette canned queries as plain SQL files ☆14 Updated 3 years ago