markusmobius / go-trafilatura
go-trafilatura is a Go port of the trafilatura Python library.
☆61 Updated 6 months ago
Alternatives and similar repositories for go-trafilatura
Users who are interested in go-trafilatura are comparing it to the libraries listed below.
- Go-DomDistiller is a Go port of the DOM Distiller library which implements Reader mode in Chrome for Android and Desktop. It has no depen… ☆68 Updated 7 months ago
- Readability is a library written in Go (golang) to parse, analyze and convert HTML pages into readable content. Originally an Arc90 Exper… ☆121 Updated 2 years ago
- A Go package that implements the JusText boilerplate removal algorithm ☆109 Updated 2 years ago
- Go implementation of the SentencePiece tokenizer ☆29 Updated 8 months ago
- Go client for txtai ☆79 Updated 3 weeks ago
- A lemmatizer implemented in Go ☆86 Updated this week
- TextRank implementation in Golang with extendable features (summarization, phrase extraction) and multithreading (goroutine). ☆212 Updated 3 years ago
- Go parser for human-readable dates, ported from the dateparser Python package ☆60 Updated 2 weeks ago
- Go implementation of @qdrant/fastembed. ☆73 Updated 11 months ago
- HTML content / article extractor in Golang ☆442 Updated last year
- A high-performance Golang library for easily repairing invalid JSON documents. Designed to fix common JSON issues and optimize JSON conte… ☆36 Updated last week
- Go module for fetching embeddings from embeddings providers ☆53 Updated 3 weeks ago
- Pagser is a simple, extensible, configurable library that parses and deserializes HTML pages into structs, based on goquery and struct tags, for golang crawle… ☆107 Updated last year
- grobotstxt is a native Go port of Google's robots.txt parser and matcher library. ☆110 Updated 3 years ago
- Google Search Results GoLang API ☆69 Updated last year
- Go bindings for HuggingFace Tokenizer ☆135 Updated last month
- Article spinning and spintax/spinning syntax engine written in Go, useful for A/B testing pieces of text/articles and creating more natu… ☆59 Updated 4 years ago
- This is a simple Go package for interacting with the Replicate (https://replicate.com) HTTP APIs. Replicate is an API service that allows… ☆26 Updated last year
- Production grade LLM-ops in Golang ☆55 Updated last week
- Go implementation of today's most used tokenizers ☆42 Updated 4 years ago
- Generate OpenAPI 3.0 specifications from Go code. ☆66 Updated 8 months ago
- A highly effective golang library for parsing large sitemaps while avoiding high memory usage. The sitemap parser was written in golang w… ☆38 Updated 2 years ago
- panicwatch is a Go library for panic handling/reporting in Go applications. Inspired by mitchellh/panicwrap ☆18 Updated last week
- Simple Go package to convert HTML to plain text ☆153 Updated last year
- NLP transformers written in Go ☆230 Updated 2 years ago
- A lightweight buffered event lib ☆59 Updated 3 years ago
- Neural Language Model for Go ☆59 Updated last year
- SQLite FTS5-based search engine for Hugo pages ☆35 Updated 2 weeks ago
- NLP tokenizers written in Go language ☆233 Updated 5 months ago
- Support for reading and writing PDF files in Go. ☆34 Updated this week