ChrisCates / CommonCrawler
🕸 A simple way to extract data from Common Crawl
★34, updated 5 years ago
Alternatives and similar repositories for CommonCrawler
Users who are interested in CommonCrawler are comparing it to the libraries listed below.
- Lens is an opt-in search engine and data collection tool to aid content discovery of the distributed web (★61, updated 6 years ago)
- Read and write WARC files in Go (★46, updated 7 years ago)
- TextRank implementation in Golang with extendable features (summarization, phrase extraction) and multithreading (goroutine). (★218, updated 3 months ago)
- Stream, filter and react to Twitter status updates on the command line (★13, updated 7 years ago)
- Removes most frequent words (stop words) from a text content. Based on a Curated list of language statistics. (★149, updated 2 years ago)
- Datastore implementation using badger as backend. (★58, updated last month)
- Multihash implementation in Go (★241, updated last month)
- Text summarizer for golang using LexRank (★133, updated last year)
- package lingo provides the data structures and algorithms required for natural language processing (★156, updated 2 years ago)
- BitMaelum software suite (★62, updated 2 years ago)
- Nifty library to manage, query and store RDF triples. Make RDF great again! (★115, updated 6 years ago)
- Simple Email Parser (★47, updated 9 years ago)
- libp2p WebRTC transport in Go that includes a discovery mechanism provided by the signalling-star (★27, updated 6 years ago)
- Miscellaneous tools for processing WARC files from the CommonCrawl (★24, updated 11 years ago)
- goprocess - like Context, but with good close semantics. (★73, updated 5 years ago)
- A Go port of the Rapid Automatic Keyword Extraction algorithm (RAKE) (★121, updated 3 months ago)
- adding badger support to blevesearch (★63, updated 2 years ago)
- Summarizes text (★39, updated 10 years ago)
- Gonudb is an append-only key/value datastore written in Go. (★19, updated last year)
- Implementation of a unix-like filesystem on top of an ipld merkledag (★105, updated 2 years ago)
- A personal knowledge base, an experimental project under development. (★128, updated 4 months ago)
- Fabric is a simple triplestore written in Golang (★198, updated 2 years ago)
- BlackholeDB is a simple distributed key-value DB based on IPFS protocol. (★124, updated last month)
- Zero knowledge push relay (★33, updated 6 years ago)
- Go implementation of haadcode's orbit-db (★11, updated 8 years ago)
- Shamir's Secret Sharing Algorithm implementation in golang combined with PGP and a mail delivery system (★35, updated 7 years ago)
- NAT port mapping library for go-libp2p (★62, updated 3 years ago)
- HTTP on top of libp2p (★65, updated last month)
- A fast URL parser for Go (★40, updated 2 years ago)
- Scraping script used to test the Spanish web archive and redirects system, with more than 10,000 pages. It checks redirections, http res… (★46, updated 4 years ago)