Florents-Tselai / WarcDB
WarcDB: Web crawl data as SQLite databases.
☆399Updated 6 months ago
Alternatives and similar repositories for WarcDB:
Users who are interested in WarcDB are comparing it to the libraries listed below.
- A SQLite extension which loads a Google Sheet as a virtual table.☆514Updated 2 years ago
- A SQLite extension for reading large files line-by-line (NDJSON, logs, txt, etc.)☆395Updated last year
- A SQLite extension for querying, manipulating, and creating HTML elements.☆380Updated last year
- Python module to parse ingredient names, splitting them into the ingredient, unit and quantity. It is trained on a publicly available dat…☆151Updated last year
- Query SQLite files in S3 using s3fs☆496Updated 2 years ago
- Minimalist log collector☆114Updated 2 weeks ago
- Dockerized local and offline backing up of PostgreSQL with rotation and compression.☆210Updated last year
- Create a SQLite database containing metadata from Google Drive☆154Updated 2 years ago
- Distributed Embeddable Database☆255Updated 2 years ago
- Simple Python Calculation Engine☆134Updated 2 years ago
- Functional UUIDs for Python.☆148Updated 3 years ago
- Geocode rows in a SQLite database table☆233Updated 2 years ago
- ☆114Updated 3 years ago
- 🗄️ A simple CLI for converting WARC to Parquet.☆108Updated last week
- Web clipper browser extension for saving highlights, screenshots, and automatically extracting content from web pages.☆372Updated 3 years ago
- The various scripts I use to back up my home computers using ssh and rsync☆199Updated 3 years ago
- Use cURL with cookies from Chrome☆330Updated last year
- Scrapy rotation proxy package with advanced functions☆94Updated 2 years ago
- Reverse Geocode for OpenStreetMap☆122Updated 4 months ago
- Rsync-based time machine for Linux, written in Python, for local and remote backups.☆170Updated last year
- A Python library to inspect and modify the internal structure of a PDF file☆426Updated this week
- A SQLite extension for making HTTP requests purely in SQL☆240Updated last year
- Module Oriented Large Archive Specialized Slow Exhaustive Searcher☆113Updated 9 years ago
- Create a website from a git repository in one click☆407Updated 2 years ago
- Repository for Pipes☆268Updated 5 months ago
- A simple website for sharing table data, with an API☆387Updated 2 months ago
- Dirty Little SQL Notebook☆112Updated 2 years ago
- Medical wordlists in EN/FR/UA☆86Updated last year
- Minimalist Error collection Service compatible with Rollbar clients. Sentry or Rollbar alternative.☆388Updated 3 years ago
- A generator for OpenAPI 3☆97Updated 4 years ago