wolfgangmeyers / go-warc
A golang library to work with WARC files from the common crawl
☆15 Updated 7 years ago
Alternatives and similar repositories for go-warc
Users that are interested in go-warc are comparing it to the libraries listed below
- Read and write WARC files in Go ☆47 Updated 7 years ago
- golang readers for ARC and WARC webarchive formats ☆20 Updated 2 years ago
- The speed of a native map, the safety of sync.RWMutex and the durability of bbolt ☆24 Updated 5 years ago
- Serve millions of JSON documents via HTTP. ☆70 Updated 9 months ago
- A pure Go implementation of the smaz compression library for short strings. ☆20 Updated 9 years ago
- Miscellaneous tools for processing WARC files from the CommonCrawl ☆24 Updated 11 years ago
- DNS client & server package for Go ☆42 Updated 6 years ago
- web-based UI editor for bleve index mappings ☆23 Updated 3 months ago
- A simple, lightweight, embedded geocoder for Golang with city level accuracy ☆73 Updated 9 years ago
- adding badger support to blevesearch ☆62 Updated 2 years ago
- CLD2 (Compact Language Detector 2) bindings for Go (golang) ☆38 Updated 5 years ago
- Latent Dirichlet Allocation ☆31 Updated 3 years ago
- Tokenizers and lemmatizers for Go ☆110 Updated last year
- Removes most frequent words (stop words) from a text content. Based on a Curated list of language statistics. ☆149 Updated 2 years ago
- Go library for DPoP (OAuth 2.0 Demonstration of Proof-of-Possession at the Application Layer) ☆22 Updated 5 years ago
- Go package for abstracting local, in-memory, and remote (Google Cloud Storage/S3) filesystems ☆52 Updated 6 years ago
- Document Indexing and Searching Library in Go ☆19 Updated 5 years ago
- Package mbox parses the mbox file format into messages and formats messages into mbox files ☆73 Updated 2 months ago
- dmmclust is a package for clustering short texts, based on Yin and Wang (2014) ☆26 Updated 7 years ago
- Increasing bleve indexing performance with sharding ☆20 Updated 7 years ago
- GoShort is a URL shortener written in Golang, BoltDB is used for in memory and persistent key/value storage and for routing it's using h… ☆67 Updated 5 years ago
- A simple Go Library to calculate a phash string for a JPEG image. ☆34 Updated 10 years ago
- Go HTML Info package for extracting meaningful information from html page ☆35 Updated 2 years ago
- A golang phonetics algorithm library ☆31 Updated 10 years ago
- pure golang spelling based on hunspell dictionaries ☆41 Updated 9 years ago
- Package secureheader adds some HTTP headers widely considered to improve safety of HTTP requests. ☆105 Updated 7 years ago
- Pure-Go full text indexer and search library ☆93 Updated 10 years ago
- Text summarizer for golang using LexRank ☆132 Updated last year
- Parser for HTML microdata, schema.org ☆34 Updated 8 years ago
- Small trigram indexer ☆33 Updated last year