markusmobius / go-trafilatura
go-trafilatura is a Go port of the trafilatura Python library.
☆59 · Updated 5 months ago
Alternatives and similar repositories for go-trafilatura:
Users who are interested in go-trafilatura are comparing it to the libraries listed below:
- Go-DomDistiller is a Go port of the DOM Distiller library which implements Reader mode in Chrome for Android and Desktop. It has no depen… ☆68 · Updated 6 months ago
- Go implementation of @qdrant/fastembed. ☆70 · Updated 10 months ago
- A lemmatizer implemented in Go ☆86 · Updated 2 weeks ago
- Go client for txtai ☆79 · Updated last week
- Readability is a library written in Go (golang) to parse, analyze and convert HTML pages into readable content. Originally an Arc90 Exper… ☆121 · Updated 2 years ago
- TextRank implementation in Golang with extendable features (summarization, phrase extraction) and multithreading (goroutine). ☆212 · Updated 3 years ago
- Go module for fetching embeddings from embeddings providers ☆53 · Updated 2 weeks ago
- Pagser is a simple, extensible, configurable parse and deserialize html page to struct based on goquery and struct tags for golang crawle… ☆107 · Updated last year
- A Go package that implements the JusText boilerplate removal algorithm ☆109 · Updated 2 years ago
- Go bindings for HuggingFace Tokenizer ☆134 · Updated 3 weeks ago
- NLP tokenizers written in Go language ☆229 · Updated 4 months ago
- Unofficial (Golang) Go bindings for the Hugging Face Inference API ☆62 · Updated 2 weeks ago
- Cybertron: the home planet of the Transformers in Go ☆304 · Updated 10 months ago
- A lightweight buffered event lib ☆58 · Updated 3 years ago
- A high-performance Golang library for easily repairing invalid JSON documents. Designed to fix common JSON issues and optimize JSON conte… ☆34 · Updated 2 weeks ago
- A Go port of the Rapid Automatic Keyword Extraction algorithm (RAKE) ☆120 · Updated 3 months ago
- Go implementation of the SentencePiece tokenizer ☆28 · Updated 7 months ago
- Letters, or how to parse emails in Go ☆70 · Updated 2 weeks ago
- Pure Go implementation of OpenAI's tiktoken tokenizer ☆349 · Updated 3 weeks ago
- go parser for human readable dates ported from the dateparser python package ☆59 · Updated 9 months ago
- A Go implementation of the Thumbhash image placeholder generation algorithm. ☆83 · Updated 9 months ago
- This is a Golang open-source module that makes it easy to access and parse data from Wikipedia (Wikipedia API wrapper) ☆101 · Updated last year
- Html Content / Article Extractor in Golang ☆442 · Updated last year
- panicwatch is a Go library for panic handling/reporting in Go applications. Inspired by mitchellh/panicwrap ☆18 · Updated 2 weeks ago
- Extremely Fast Full-Text-Search Algorithm and Caching System ☆157 · Updated last year
- Write Python in Go - The most intuitive Python wrapper for Golang ☆38 · Updated 4 months ago
- Restful, in-memory, full-text search engine ☆34 · Updated 4 months ago
- Natural Language Processing Toolkit in Golang ☆64 · Updated 4 years ago
- SQLite FTS5-based search engine for Hugo pages ☆35 · Updated this week
- Go Based Lightweight RAG / LLM Tool with CLI + API ☆14 · Updated last year