jlubawy / go-boilerpipe
Golang port of the boilerpipe Java library used for the removal of boilerplate and extraction of text content from HTML documents.
☆72 · Updated 9 months ago
Alternatives and similar repositories for go-boilerpipe
Users interested in go-boilerpipe are comparing it to the libraries listed below.
- An implementation of the Goose HTML Content / Article Extractor algorithm in golang ☆40 · Updated 4 years ago
- simhash storage and searching ☆138 · Updated 8 years ago
- Text summarizer for golang using LexRank ☆137 · Updated 3 months ago
- A Go implementation of the readability algorithm by arc90 labs ☆135 · Updated 3 years ago
- Removes most frequent words (stop words) from a text content. Based on a Curated list of language statistics. ☆152 · Updated 2 years ago
- mediawiki dump parser for loading up wikipedia data ☆108 · Updated 2 months ago
- A simple, lightweight, embedded geocoder for Golang with city level accuracy ☆73 · Updated 10 years ago
- Ngram index for golang ☆114 · Updated 9 years ago
- Go library for performing computations in word2vec binary models ☆203 · Updated 3 years ago
- package lingo provides the data structures and algorithms required for natural language processing ☆158 · Updated 2 years ago
- Named Entity Recognition for golang via MITIE ☆35 · Updated 7 years ago
- A Go package that implements the JusText boilerplate removal algorithm ☆110 · Updated 3 years ago
- Stemmer packages for Go programming language. Includes English, German and Dutch stemmers. ☆54 · Updated 9 years ago
- Multiclass Naive Bayesian Classification ☆77 · Updated 7 years ago
- Html Content / Article Extractor in Golang ☆448 · Updated 5 months ago
- Spell checking and fuzzy search suggestion written in Go ☆389 · Updated 4 years ago
- Utilities for working with discrete probability distributions and other tools useful for doing NLP work ☆95 · Updated 14 years ago
- A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29 ☆88 · Updated 3 years ago
- Pluck text in a fast and intuitive way ☆216 · Updated 6 years ago
- GNU Aspell spell checking library bindings for Go (golang) ☆47 · Updated 5 years ago
- An approximate string matching library for the Go programming language. ☆182 · Updated 3 years ago
- A multilingual command line sentence tokenizer in Golang ☆463 · Updated last year
- Matrix Factorization based recsys in Golang. Because facts are more important than ever ☆35 · Updated 7 years ago
- A Go implementation of the WordNet API ☆39 · Updated 6 years ago
- Summarizes text ☆39 · Updated 10 years ago
- Offline language detection ☆47 · Updated 8 years ago
- dmmclust is a package for clustering short texts, based on Yin and Wang (2014) ☆26 · Updated 8 years ago
- Webpage summary extractor using Facebook Open Graph and arc90's readability ☆68 · Updated 6 years ago
- A Go package for n-gram based text categorization, with support for utf-8 and raw text ☆73 · Updated last year
- CLD2 (Compact Language Detector 2) bindings for Go (golang) ☆38 · Updated 6 years ago