ChrisCates / CommonCrawler
🕸 A simple way to extract data from Common Crawl
☆34 Updated 5 years ago
Alternatives and similar repositories for CommonCrawler
Users who are interested in CommonCrawler are comparing it to the libraries listed below.
- 🔍 Lens is an opt-in search engine and data collection tool to aid content discovery of the distributed web ☆61 Updated 6 years ago
- Read and write WARC files in Go ☆46 Updated 7 years ago
- Removes the most frequent words (stop words) from text content. Based on a curated list of language statistics. ☆151 Updated 2 years ago
- Text summarizer for golang using LexRank ☆134 Updated 2 weeks ago
- Miscellaneous tools for processing WARC files from the CommonCrawl ☆24 Updated 11 years ago
- Datastore implementation using badger as backend. ☆58 Updated 2 months ago
- A personal knowledge base, an experimental project under development. ☆129 Updated 5 months ago
- Summarizes text ☆39 Updated 10 years ago
- package lingo provides the data structures and algorithms required for natural language processing ☆156 Updated 2 years ago
- PageRank implementation in Go ☆99 Updated last year
- Go port of secret-handshake ☆45 Updated 10 months ago
- Simple Email Parser ☆47 Updated 9 years ago
- Go implementation of haadcode's orbit-db ☆11 Updated 8 years ago
- An example app providing an HTTP/REST/JSON front-end to bleve ☆133 Updated 7 months ago
- adding badger support to blevesearch ☆63 Updated 2 years ago
- Gonudb is an append-only key/value datastore written in Go. ☆19 Updated last year
- simple base58 codec ☆20 Updated 8 years ago
- Gorjun is a golang replacement for the Kurjun project. ☆19 Updated 6 years ago
- A Go package for n-gram based text categorization, with support for utf-8 and raw text ☆72 Updated 10 months ago
- A Go SDK to make voice calls & send SMS using Plivo and to generate Plivo XML ☆34 Updated this week
- example bleve application for indexing and searching beers and breweries ☆91 Updated 7 months ago
- Maybe the tiniest HTTP proxy that also has a cache ☆68 Updated 3 years ago
- A RiveScript interpreter for Go. RiveScript is a scripting language for chatterbots. ☆62 Updated 2 years ago
- A golang library to work with WARC files from the common crawl ☆15 Updated 7 years ago
- Realtime data exchange platform for Smart Cities ☆26 Updated 5 years ago
- web-based UI editor for bleve index mappings ☆23 Updated 6 months ago
- A generic patricia trie (also called radix tree) implemented in Go (Golang) ☆29 Updated 6 years ago
- HTTP on top of libp2p ☆67 Updated 2 months ago
- TextRank implementation in Golang with extendable features (summarization, phrase extraction) and multithreading (goroutine). ☆218 Updated 4 months ago
- Weighted PageRank implementation in Go ☆87 Updated 4 years ago