jlubawy / go-boilerpipe
Golang port of the boilerpipe Java library used for the removal of boilerplate and extraction of text content from HTML documents.
☆70 · Updated 9 months ago
Alternatives and similar repositories for go-boilerpipe:
Users who are interested in go-boilerpipe are comparing it to the libraries listed below.
- An implementation of the Goose HTML Content / Article Extractor algorithm in golang ☆40 · Updated 3 years ago
- Named Entity Recognition for golang via MITIE ☆33 · Updated 6 years ago
- Multiclass Naive Bayesian Classification ☆75 · Updated 6 years ago
- simhash storage and searching ☆138 · Updated 7 years ago
- A Go implementation of the readability algorithm by arc90 labs ☆132 · Updated 2 years ago
- Text summarizer for golang using LexRank ☆126 · Updated 10 months ago
- High Performance Porter2 Stemmer ☆47 · Updated 4 years ago
- package lingo provides the data structures and algorithms required for natural language processing ☆153 · Updated last year
- A simple, lightweight, embedded geocoder for Golang with city level accuracy ☆73 · Updated 9 years ago
- An approximate string matching library for the Go programming language. ☆178 · Updated 2 years ago
- Stemmer packages for Go programming language. Includes English, German and Dutch stemmers. ☆53 · Updated 8 years ago
- Summarizes text ☆38 · Updated 9 years ago
- A Go implementation of the WordNet API ☆39 · Updated 5 years ago
- Html Content / Article Extractor in Golang ☆441 · Updated 9 months ago
- A generic patricia trie (also called radix tree) implemented in Go (Golang) ☆28 · Updated 5 years ago
- Utilities for working with discrete probability distributions and other tools useful for doing NLP work ☆96 · Updated 13 years ago
- Word Stemming in Go ☆82 · Updated 6 years ago
- Go library for performing computations in word2vec binary models ☆196 · Updated 2 years ago
- Go Stanford NLP POS Tagger wrapper ☆38 · Updated 7 years ago
- Read and write WARC files in Go ☆44 · Updated 6 years ago
- TextRank implementation in Golang with extendable features (summarization, phrase extraction) and multithreading (goroutine). ☆207 · Updated 3 years ago
- A Go package that implements the JusText boilerplate removal algorithm ☆108 · Updated 2 years ago
- Nifty library to manage, query and store RDF triples. Make RDF great again! ☆115 · Updated 5 years ago
- adding badger support to blevesearch ☆62 · Updated last year
- go-corenlp is a Golang wrapper for Stanford CoreNLP. ☆30 · Updated 5 years ago
- Matrix Factorization based recsys in Golang. Because facts are more important than ever ☆33 · Updated 6 years ago
- Tokenizers and lemmatizers for Go ☆108 · Updated 8 months ago
- Ngram index for golang ☆114 · Updated 8 years ago
- Levenshtein Distance in Go ☆40 · Updated 6 years ago
- Removes most frequent words (stop words) from a text content. Based on a Curated list of language statistics. ☆147 · Updated last year