kbullaughey / warc-tools
Miscellaneous tools for processing WARC files from the CommonCrawl
☆24 Updated 11 years ago
Alternatives and similar repositories for warc-tools
Users that are interested in warc-tools are comparing it to the libraries listed below
- Summarizes text ☆39 Updated 9 years ago
- [Go] FreeTree - generic binary-search-tree without any GC overhead ☆45 Updated 9 years ago
- Chrome Automation Library using Google Chrome Remote Debugger API in Go ☆85 Updated 3 years ago
- Follow Twitter users based on keywords ☆29 Updated 8 years ago
- 🏠 Explode one-line address strings using Golang ☆53 Updated 2 years ago
- Fast identification of character sequences in text or documents (multi-lingual) ☆18 Updated 9 years ago
- Render screenshots of given urls on Linux using Xvfb, midori, ratpoison & ImageMagick ☆83 Updated 9 years ago
- A simple domain name server to tolerate typos in subdomains written in Go ☆52 Updated 9 years ago
- A go package to parse human-readable date and time strings ☆54 Updated 5 years ago
- A simple library for loading word2vec binary model. ☆12 Updated 9 years ago
- Go known-keys fast-lookup map generator ☆46 Updated 4 years ago
- Deal with CLI prompts in style ☆40 Updated 6 years ago
- 💧 In memory dataset filtering ☆49 Updated 3 years ago
- ipLocator - a basic Geo-Ip Server ☆71 Updated 6 years ago
- Package atomicfile provides an atomically written/replaced file. ☆51 Updated 7 years ago
- Weighted PageRank implementation in Go ☆86 Updated 4 years ago
- Super simple, concurrent worker queue in golang ☆68 Updated 5 years ago
- A simple, file-based Database Management System (DBMS) for Go ☆32 Updated 10 years ago
- A Go library which determines the dominant colors in an image. ☆19 Updated 10 years ago
- A simple, lightweight, embedded geocoder for Golang with city level accuracy ☆73 Updated 9 years ago
- golang smtp server that just writes every incoming email to a text file ☆44 Updated 9 years ago
- interactive, configurable content-blocking proxy written in golang ☆95 Updated 9 years ago
- Simple Go implementation of the Porter Stemmer algorithm with powerful features. ☆27 Updated 4 years ago
- A robust framing and encryption layer for your Go network programs, based on CurveZMQ. ☆36 Updated 7 years ago
- Serve millions of JSON documents via HTTP. ☆70 Updated 9 months ago
- Go package for abstracting local, in-memory, and remote (Google Cloud Storage/S3) filesystems ☆52 Updated 6 years ago
- Cross-platform persistent and distributed web crawler ☆112 Updated 7 years ago
- A pure Go implementation of the smaz compression library for short strings. ☆20 Updated 9 years ago
- An IP lookup system utilizing open datasets ☆61 Updated 3 years ago
- A probabilistic data structure service and storage ☆92 Updated 9 years ago