Florents-Tselai / WarcDB
WarcDB: Web crawl data as SQLite databases.
☆394 Updated 4 months ago
Alternatives and similar repositories for WarcDB:
Users who are interested in WarcDB are comparing it to the libraries listed below:
- A SQLite extension which loads a Google Sheet as a virtual table. ☆510 Updated 2 years ago
- A SQLite extension for reading large files line-by-line (NDJSON, logs, txt, etc.) ☆392 Updated last year
- Query SQLite files in S3 using s3fs ☆487 Updated 2 years ago
- A SQLite extension for querying, manipulating, and creating HTML elements. ☆375 Updated last year
- 🗄️ A simple CLI for converting WARC to Parquet. ☆106 Updated this week
- Python module to parse ingredient names. Splitting them into the ingredient, unit and quantity. It is trained on a publicly available dat… ☆151 Updated last year
- A Python library to inspect and modify the internal structure of a PDF file ☆423 Updated 2 months ago
- Geocode rows in a SQLite database table ☆232 Updated 2 years ago
- Create a SQLite database containing metadata from Google Drive ☆153 Updated 2 years ago
- A self-hosted live video streaming platform with Discord authentication, auto-recording and more! ☆348 Updated 4 months ago
- ☆114 Updated 3 years ago
- Shell scripting for serverless ☆141 Updated 2 years ago
- Verneuil is a VFS extension for SQLite that asynchronously replicates databases to S3-compatible blob stores. ☆452 Updated last month
- Functional UUIDs for Python. ☆146 Updated 3 years ago
- The various scripts I use to back up my home computers using ssh and rsync ☆199 Updated 3 years ago
- Use cURL with cookies from Chrome ☆330 Updated last year
- Minimalist log collector ☆113 Updated 4 months ago
- Module Oriented Large Archive Specialized Slow Exhaustive Searcher ☆113 Updated 9 years ago
- Template repository for setting up shot-scraper ☆236 Updated 8 months ago
- Dirty Little SQL Notebook ☆111 Updated 2 years ago
- Dockerized local and offline backing up of PostgreSQL with rotation and compression. ☆210 Updated last year
- Create huge SQLite indexes at breakneck speeds ☆181 Updated 6 months ago
- Execute a command when something changes ☆459 Updated last year
- α-Indirect Control in Onion-like Networks ☆149 Updated 11 months ago
- Rsync-based time machine for Linux, written in Python, for local and remote backups. ☆168 Updated last year
- Minimal, no-JS web forum software ☆661 Updated 10 months ago
- ☆229 Updated 2 years ago
- Create a website from a git repository in one click ☆406 Updated last year