slyrz / warc
Read and write WARC files in Go
☆44 · Updated 6 years ago
Alternatives and similar repositories for warc:
Users who are interested in warc are comparing it to the libraries listed below.
- golang readers for ARC and WARC webarchive formats ☆21 · Updated last year
- A golang library to work with WARC files from the common crawl ☆15 · Updated 7 years ago
- Package mbox parses the mbox file format into messages and formats messages into mbox files ☆70 · Updated 2 years ago
- Miscellaneous tools for processing WARC files from the CommonCrawl ☆24 · Updated 11 years ago
- A generic patricia trie (also called radix tree) implemented in Go (Golang) ☆28 · Updated 5 years ago
- mediawiki dump parser for loading up wikipedia data ☆103 · Updated last year
- CLD2 (Compact Language Detector 2) bindings for Go (golang) ☆38 · Updated 5 years ago
- Golang package to extract useful text from a HTML document ☆40 · Updated last year
- High Performance Porter2 Stemmer ☆45 · Updated 4 years ago
- grobotstxt is a native Go port of Google's robots.txt parser and matcher library. ☆108 · Updated 2 years ago
- An implementation of the Goose HTML Content / Article Extractor algorithm in golang ☆40 · Updated 3 years ago
- Levenshtein Distance in Go ☆40 · Updated 6 years ago
- 📖⏎ An efficient and flexible word-wrapping package for Go (golang) ☆16 · Updated 4 years ago
- Stemmer packages for Go programming language. Includes English, German and Dutch stemmers. ☆53 · Updated 8 years ago
- A Go package that implements the JusText boilerplate removal algorithm ☆108 · Updated 2 years ago
- Latent Dirichlet Allocation ☆30 · Updated 3 years ago
- Go package to parse GEDCOM files. ☆38 · Updated 5 months ago
- Takes a full name and splits it into individual name parts ☆43 · Updated 4 months ago
- Command deadleaves finds and prints the import paths of unused Go packages. ☆34 · Updated 8 years ago
- Middleware for keeping track of users, login states and permissions ☆89 · Updated last year
- An approximate string matching library for the Go programming language. ☆177 · Updated 2 years ago
- Parse JPEG data into segments via code or CLI from pure Go. Read/export/write EXIF data. Read XMP and IPTC metadata. ☆72 · Updated 2 years ago
- A simple Go Library to calculate a phash string for a JPEG image. ☆34 · Updated 9 years ago
- Email parsing and mail creation library for golang ☆95 · Updated 11 months ago
- Golang WARC (Web ARChive) Library ☆30 · Updated 5 years ago
- A simple natural sorter for Go Strings ☆29 · Updated 9 years ago
- EU VAT number validation in Go using VIES SOAP service ☆26 · Updated 4 years ago
- ☆21 · Updated last year
- An event based XML parsing API for Go ☆20 · Updated 10 years ago
- Chrome Automation Library using Google Chrome Remote Debugger API in Go ☆85 · Updated 3 years ago