ChrisCates / CommonCrawler
🕸 A simple way to extract data from Common Crawl
☆34 · Updated 5 years ago
Alternatives and similar repositories for CommonCrawler
Users who are interested in CommonCrawler are comparing it to the libraries listed below.
- Text summarizer for golang using LexRank ☆137 · Updated last month
- Miscellaneous tools for processing WARC files from the CommonCrawl ☆24 · Updated 11 years ago
- package lingo provides the data structures and algorithms required for natural language processing ☆158 · Updated 2 years ago
- Removes most frequent words (stop words) from a text content. Based on a Curated list of language statistics. ☆152 · Updated 2 years ago
- Read and write WARC files in Go ☆47 · Updated 7 years ago
- BitMaelum software suite ☆62 · Updated 2 years ago
- Mozilla's Gecko Marionette client in golang ☆56 · Updated 8 months ago
- 🔍 Lens is an opt-in search engine and data collection tool to aid content discovery of the distributed web ☆61 · Updated 6 years ago
- Datastore implementation using badger as backend. ☆58 · Updated 3 months ago
- A RiveScript interpreter for Go. RiveScript is a scripting language for chatterbots. ☆62 · Updated 2 years ago
- PageRank implementation in Go ☆101 · Updated last year
- A Go port of the Rapid Automatic Keyword Extraction algorithm (RAKE) ☆122 · Updated 5 months ago
- Go port of secret-handshake ☆45 · Updated 11 months ago
- Summarizes text ☆39 · Updated 10 years ago
- Cross-platform persistent and distributed web crawler ☆63 · Updated 6 years ago
- Maybe the tiniest HTTP proxy that also has a cache ☆68 · Updated 3 years ago
- simple base58 codec ☆20 · Updated 8 years ago
- TextRank implementation in Golang with extendable features (summarization, phrase extraction) and multithreading (goroutine). ☆220 · Updated 5 months ago
- Livestreaming via IPFS ☆23 · Updated 2 years ago
- Reusable Golang library to provide readability scores ☆22 · Updated 4 years ago
- Zero knowledge push relay ☆33 · Updated 6 years ago
- Additional functionality for Go's os package ☆18 · Updated 8 months ago
- adding badger support to blevesearch ☆63 · Updated 2 years ago
- A generic patricia trie (also called radix tree) implemented in Go (Golang) ☆29 · Updated 6 years ago
- An easy-to-use, lightweight embedded on-disk database built on Badger for use in your Go programs. ☆52 · Updated 5 years ago
- Simple Email Parser ☆47 · Updated 9 years ago
- Weighted PageRank implementation in Go ☆87 · Updated 4 years ago
- A simple, lightweight, embedded geocoder for Golang with city level accuracy ☆73 · Updated 10 years ago
- Middleware for keeping track of users, login states and permissions ☆88 · Updated 2 months ago
- Websites scanner for X-Recruiting header ☆20 · Updated 7 years ago