jlubawy / go-boilerpipe
Golang port of the boilerpipe Java library used for the removal of boilerplate and extraction of text content from HTML documents.
☆70 Updated last month
Alternatives and similar repositories for go-boilerpipe
Users who are interested in go-boilerpipe are comparing it to the libraries listed below.
- An implementation of the Goose HTML Content / Article Extractor algorithm in golang ☆40 Updated 4 years ago
- Named Entity Recognition for golang via MITIE ☆33 Updated 6 years ago
- A Go implementation of the readability algorithm by arc90 labs ☆133 Updated 3 years ago
- Multiclass Naive Bayesian Classification ☆75 Updated 6 years ago
- A simple, lightweight, embedded geocoder for Golang with city level accuracy ☆73 Updated 9 years ago
- An example app providing an HTTP/REST/JSON front-end to bleve ☆133 Updated 2 months ago
- Stemmer packages for Go programming language. Includes English, German and Dutch stemmers. ☆53 Updated 8 years ago
- simhash storage and searching ☆138 Updated 8 years ago
- A Go package that implements the JusText boilerplate removal algorithm ☆109 Updated 2 years ago
- Text summarizer for golang using LexRank ☆129 Updated last year
- Go Stanford NLP POS Tagger wrapper ☆38 Updated 8 years ago
- A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29 ☆89 Updated 2 years ago
- Tokenizers and lemmatizers for Go ☆110 Updated 11 months ago
- go-corenlp is a Golang wrapper for Stanford CoreNLP. ☆30 Updated 5 years ago
- Summarizes text ☆38 Updated 9 years ago
- High Performance Porter2 Stemmer ☆46 Updated 4 years ago
- CLD2 (Compact Language Detector 2) bindings for Go (golang) ☆38 Updated 5 years ago
- Read and write WARC files in Go ☆45 Updated 7 years ago
- Ngram index for golang ☆114 Updated 8 years ago
- Bayesian text classifier with flexible tokenizers and storage backends for Go ☆158 Updated 5 years ago
- Webpage summary extractor using Facebook Open Graph and arc90's readability ☆69 Updated 6 years ago
- A Go package for working with headless Chrome. Run interactive JavaScript commands on web pages with Go and Chrome. ☆122 Updated 6 years ago
- ipLocator - a basic Geo-Ip Server ☆71 Updated 6 years ago
- Guess the natural language of a text in Go ☆58 Updated 7 years ago
- Levenshtein Distance in Go ☆40 Updated 6 years ago
- A generic patricia trie (also called radix tree) implemented in Go (Golang) ☆28 Updated 5 years ago
- A graph library in Go ☆78 Updated 5 years ago
- Removes most frequent words (stop words) from a text content. Based on a Curated list of language statistics. ☆148 Updated last year
- A Go implementation of the WordNet API ☆39 Updated 6 years ago
- Html Content / Article Extractor in Golang ☆442 Updated last year