maxcountryman / warc-parquet
A simple CLI for converting WARC to Parquet.
⭐113 · Updated 9 months ago
Alternatives and similar repositories for warc-parquet
Users who are interested in warc-parquet are comparing it to the libraries listed below.
- Scale to zero Seafowl hosting with Cloud Run ⭐37 · Updated 2 years ago
- ⭐107 · Updated 7 months ago
- A safe, stateful rules language for event streams ⭐114 · Updated 2 years ago
- SQLite3 extension for read-only HTTP(S) database access ⭐57 · Updated 2 years ago
- Testing various image matching algorithms' performance on the Pinecone vector DB ⭐43 · Updated 2 years ago
- Create a SQLite database containing metadata from Google Drive ⭐163 · Updated 8 months ago
- ayb makes it easy to create databases, share them with collaborators, and query them from anywhere ⭐78 · Updated last week
- WarcDB: Web crawl data as SQLite databases. ⭐404 · Updated last year
- the fastest CSV SQLite extension, written in Rust ⭐140 · Updated 9 months ago
- Code to accompany blog post https://reorchestrate.com/posts/sqlite-transactions ⭐65 · Updated last year
- jq extension for SQLite. ⭐103 · Updated last year
- Gavin Mendel-Gleason's blog ⭐88 · Updated last year
- Shell scripting for serverless ⭐142 · Updated 3 years ago
- Zig library for HyperLogLog estimation ⭐91 · Updated last year
- SQL transformation tool for DuckDB written in Rust ⭐72 · Updated 8 months ago
- ZSV Utility for converting json to/from zip-separated-values ⭐56 · Updated last year
- Beating the `bisect` module's implementation using C-extensions. ⭐30 · Updated 2 years ago
- ⭐165 · Updated last year
- The Directed Acyclic Graph Elevation Markup Language ⭐81 · Updated 7 months ago
- Versioning filesystem for SQLite ⭐241 · Updated last year
- A SQLite extension for extracting values from serialized Protobuf messages ⭐88 · Updated 5 months ago
- What if an HNSW index was just a file, and you could serve it from a CDN, and search it directly in the browser? ⭐108 · Updated 7 months ago
- progscrape.com source ⭐95 · Updated 3 months ago
- ROAPI user documentation ⭐55 · Updated 5 months ago
- Steampipe SQLite is a zero-ETL engine for SQLite. Virtual tables translate queries into live API calls for cloud services and APIs. Hundr… ⭐58 · Updated 3 months ago
- Block Erasure Format - An extensible, fast, and usable file utility to encode and decode interleaved erasure coded streams of data. ⭐58 · Updated last year
- Fast similarity search using DuckDB ⭐142 · Updated last year
- ⭐52 · Updated 6 months ago
- Static analysis and LSP for SQL in Rust ⭐87 · Updated 2 months ago
- a file transfer service utilizing QUIC ⭐68 · Updated 11 months ago