maxcountryman / warc-parquet
A simple CLI for converting WARC to Parquet.
⭐ 106 · Updated 2 months ago
Related projects
Alternatives and complementary repositories for warc-parquet
- ⭐ 108 · Updated 3 months ago
- Scale to zero Seafowl hosting with Cloud Run ⭐ 39 · Updated last year
- A safe, stateful rules language for event streams ⭐ 113 · Updated last year
- ZSV Utility for converting json to/from zip-separated-values ⭐ 58 · Updated 5 months ago
- A simple performant GeoIP server written in Rust using MaxMind DBs with auto database update ⭐ 44 · Updated this week
- What if an HNSW index was just a file, and you could serve it from a CDN, and search it directly in the browser? ⭐ 85 · Updated 5 months ago
- Code to accompany blog post https://reorchestrate.com/posts/sqlite-transactions ⭐ 66 · Updated 3 months ago
- WarcDB: Web crawl data as SQLite databases. ⭐ 393 · Updated 3 months ago
- Testing various image matching algorithms' performance on the Pinecone vector DB ⭐ 43 · Updated last year
- Gavin Mendel-Gleason's blog ⭐ 86 · Updated 10 months ago
- Fast similarity search using DuckDB ⭐ 106 · Updated last week
- Create a SQLite database containing metadata from Google Drive ⭐ 152 · Updated 2 years ago
- Zig library for HyperLogLog estimation ⭐ 86 · Updated 3 months ago
- the fastest CSV SQLite extension, written in Rust ⭐ 122 · Updated 11 months ago
- Dumfederated gRPC social network implemented in Rust/Tonic/Diesel with both Flutter and React (web+native) frontends. EZ to deploy to… ⭐ 63 · Updated last month
- ⭐ 44 · Updated 2 years ago
- ⭐ 162 · Updated 5 months ago
- SQLite3 extension for read-only HTTP(S) database access ⭐ 51 · Updated 11 months ago
- Reverse Geocode for OpenStreetmap ⭐ 121 · Updated 2 months ago
- A small language that compiles to WebAssembly Text format ⭐ 73 · Updated 6 months ago
- webidx is a client-side search engine for static websites. ⭐ 58 · Updated 9 months ago
- Navigating arbitrarily complex bus systems can be tricky, especially as bus listings and routes are listed in O(N) style wall-posters. Ca… ⭐ 71 · Updated last year
- A js library to incorporate HN comments to any website ⭐ 31 · Updated 6 months ago
- Command-line tool to remotely execute code in the cloud ⭐ 134 · Updated 2 years ago
- Beating the `bisect` module's implementation using C-extensions. ⭐ 30 · Updated last year
- a file transfer service utilizing quic ⭐ 59 · Updated last month
- Block Erasure Format - An extensible, fast, and usable file utility to encode and decode interleaved erasure coded streams of data. ⭐ 57 · Updated 5 months ago
- A simple to use text only blog using CloudFlare Workers and KV ⭐ 78 · Updated 2 weeks ago
- ⭐ 41 · Updated last year