ChrisCates / CommonCrawler
🕸 A simple way to extract data from Common Crawl
☆34 Updated 5 years ago
Alternatives and similar repositories for CommonCrawler
Users who are interested in CommonCrawler compare it to the libraries listed below.
- Text summarizer for golang using LexRank ☆134 Updated last month
- Removes most frequent words (stop words) from a text content. Based on a Curated list of language statistics. ☆151 Updated 2 years ago
- Maybe the tiniest HTTP proxy that also has a cache ☆68 Updated 3 years ago
- Miscellaneous tools for processing WARC files from the CommonCrawl ☆24 Updated 11 years ago
- Read and write WARC files in Go ☆46 Updated 7 years ago
- Public Zone Database ☆259 Updated this week
- Weighted PageRank implementation in Go ☆87 Updated 4 years ago
- package lingo provides the data structures and algorithms required for natural language processing ☆158 Updated 2 years ago
- distributed data sync with operational transformation/transforms ☆88 Updated 6 years ago
- 🔍 Lens is an opt-in search engine and data collection tool to aid content discovery of the distributed web ☆61 Updated 6 years ago
- Simple Email Parser ☆47 Updated 9 years ago
- A Go port of the Rapid Automatic Keyword Extraction algorithm (RAKE) ☆121 Updated 4 months ago
- A PDU implementation in Go ☆48 Updated 9 months ago
- Summarizes text ☆39 Updated 10 years ago
- TextRank implementation in Golang with extendable features (summarization, phrase extraction) and multithreading (goroutine). ☆218 Updated 4 months ago
- Datastore implementation using badger as backend. ☆58 Updated 2 months ago
- scrape google search results ☆180 Updated last year
- Go implementation of haadcode's orbit-db ☆11 Updated 8 years ago
- simple base58 codec ☆20 Updated 8 years ago
- Stream, filter and react to Twitter status updates on the command line ☆13 Updated 7 years ago
- Golang framework to build an AI that can understand and speak back to you, and everything else you want ☆244 Updated 5 years ago
- A golang library to work with WARC files from the common crawl ☆15 Updated 7 years ago
- A simple tool to collect and process quite a few web news from multiple sources ☆35 Updated 3 years ago
- Websocket implementation for fasthttp. ☆54 Updated last year
- PageRank implementation in Go ☆100 Updated last year
- Cross-platform persistent and distributed web crawler ☆63 Updated 6 years ago
- A Go SDK to make voice calls & send SMS using Plivo and to generate Plivo XML ☆34 Updated 2 weeks ago
- A RiveScript interpreter for Go. RiveScript is a scripting language for chatterbots. ☆62 Updated 2 years ago
- A Go package that implements the JusText boilerplate removal algorithm ☆110 Updated 3 years ago
- Instagram power tool ☆57 Updated 6 years ago