wolfgangmeyers / go-warc
A Go library for working with WARC files from Common Crawl
☆14 Updated 6 years ago
Alternatives and similar repositories for go-warc:
Users who are interested in go-warc are comparing it to the libraries listed below:
- Read and write WARC files in Go ☆44 Updated 6 years ago
- golang readers for ARC and WARC webarchive formats ☆21 Updated last year
- The speed of a native map, the safety of sync.RWMutex and the durability of bbolt ☆24 Updated 4 years ago
- A pure Go implementation of the smaz compression library for short strings. ☆20 Updated 9 years ago
- Increasing bleve indexing performance with sharding ☆20 Updated 6 years ago
- go library for opening URLs in web browsers ☆10 Updated 8 years ago
- Miscellaneous tools for processing WARC files from the CommonCrawl ☆24 Updated 11 years ago
- A simple tool to collect and process quite a few web news from multiple sources ☆34 Updated 2 years ago
- dirscanner is a recursive file lister which uses channels for go. ☆25 Updated 5 years ago
- DNS client & server package for Go ☆41 Updated 5 years ago
- Parser for HTML microdata, schema.org ☆34 Updated 8 years ago
- dmmclust is a package for clustering short texts, based on Yin and Wang (2014) ☆25 Updated 7 years ago
- Package mbox parses the mbox file format into messages and formats messages into mbox files ☆70 Updated 2 years ago
- Command deadleaves finds and prints the import paths of unused Go packages. ☆34 Updated 8 years ago
- Package secureheader adds some HTTP headers widely considered to improve safety of HTTP requests. ☆106 Updated 6 years ago
- Serve millions of JSON documents via HTTP. ☆66 Updated 2 months ago
- Take screenshot of a web page ☆21 Updated 7 years ago
- singlefile implements a host wide locking mechanism. ☆34 Updated 9 years ago
- Pack a Go workflow/function as a Unix-style pipeline command ☆55 Updated 4 months ago
- Sorting with progress ☆10 Updated 2 years ago
- A permissions system for Go structs ☆15 Updated 6 years ago
- Pure Go implementation of cryptographic APIs found in libsodium ☆45 Updated 4 years ago
- Fast identification of character sequences in text or documents (multi-lingual) ☆18 Updated 8 years ago
- A Go package for n-gram based text categorization, with support for utf-8 and raw text ☆73 Updated last month
- A Go package that implements the JusText boilerplate removal algorithm ☆107 Updated 2 years ago
- Replication for Boltdb databases. ☆28 Updated 7 years ago
- Small wrapper around golang.org/x/crypto/openpgp ☆22 Updated 8 years ago
- CloudyKit Router fastest router for go ☆23 Updated 7 years ago
- Simple library to catch stdout/stderr in Go ☆16 Updated 5 years ago
- IMAP client and server implementation in Go ☆30 Updated 10 years ago