jlubawy/go-boilerpipe
Golang port of the boilerpipe Java library used for the removal of boilerplate and extraction of text content from HTML documents.
☆70 Updated 7 months ago
Related projects
Alternatives and complementary repositories for go-boilerpipe
- An implementation of the Goose HTML Content / Article Extractor algorithm in golang ☆40 Updated 3 years ago
- A Go implementation of the readability algorithm by arc90 labs ☆132 Updated 2 years ago
- Html Content / Article Extractor in Golang ☆438 Updated 7 months ago
- Pluck text in a fast and intuitive way ☆215 Updated 5 years ago
- CLD2 (Compact Language Detector 2) bindings for Go (golang) ☆38 Updated 5 years ago
- Named Entity Recognition for golang via MITIE ☆33 Updated 6 years ago
- Read and write WARC files in Go ☆41 Updated 6 years ago
- A simple, lightweight, embedded geocoder for Golang with city level accuracy ☆72 Updated 9 years ago
- Multiclass Naive Bayesian Classification ☆75 Updated 6 years ago
- Summarizes text ☆38 Updated 9 years ago
- A Go package that implements the JusText boilerplate removal algorithm ☆102 Updated 2 years ago
- Stemmer packages for Go programming language. Includes English, German and Dutch stemmers. ☆53 Updated 7 years ago
- Offline language detection ☆47 Updated 7 years ago
- simhash storage and searching ☆138 Updated 7 years ago
- Ngram index for golang ☆114 Updated 8 years ago
- High Performance Porter2 Stemmer ☆46 Updated 4 years ago
- Text summarizer for golang using LexRank ☆126 Updated 7 months ago
- Bayesian text classifier with flexible tokenizers and storage backends for Go ☆158 Updated 4 years ago
- An example app providing an HTTP/REST/JSON front-end to bleve ☆121 Updated 3 years ago
- A small library in golang, that detects the language of a text. (text categorization) ☆153 Updated last year
- mediawiki dump parser for loading up wikipedia data ☆101 Updated 11 months ago
- A Go implementation of the WordNet API ☆39 Updated 5 years ago
- 🔮 Use TensorFlow models in Go to evaluate Images (and more soon!) ☆63 Updated 6 years ago
- The gangsta gangsta way to pull email ☆109 Updated 4 years ago
- A Go package for working with headless Chrome. Run interactive JavaScript commands on web pages with Go and Chrome. ☆120 Updated 5 years ago
- Word Stemming in Go ☆79 Updated 6 years ago
- Chrome Automation Library using Google Chrome Remote Debugger API in Go ☆85 Updated 3 years ago
- A Go package for n-gram based text categorization, with support for utf-8 and raw text ☆72 Updated 3 years ago
- Utilities for working with discrete probability distributions and other tools useful for doing NLP work ☆96 Updated 13 years ago