ChrisCates / CommonCrawler
🕸 A simple way to extract data from Common Crawl
☆33 Updated 4 years ago
Related projects
Alternatives and complementary repositories for CommonCrawler
- Summarizes text ☆38 Updated 9 years ago
- Simple Go library for executing lots of operations spread over any number of threads ☆73 Updated last year
- Web-based UI editor for bleve index mappings ☆24 Updated this week
- Facebook fastText database in SQLite with Go API ☆32 Updated 4 years ago
- doc2vec and word2vec implemented in Golang: word embedding representation ☆41 Updated 6 years ago
- Go Stanford NLP POS Tagger wrapper ☆38 Updated 7 years ago
- Search any text-based document ☆23 Updated 4 years ago
- jsgo playground: edit and run Go code in the browser, supporting arbitrary import paths ☆50 Updated 4 years ago
- 🔍 Lens is an opt-in search engine and data collection tool to aid content discovery of the distributed web ☆61 Updated 5 years ago
- A Go package that implements the JusText boilerplate removal algorithm ☆102 Updated 2 years ago
- Go wrapper of libutp reference uTP C implementation ☆90 Updated 6 months ago
- Named Entity Recognition for golang via MITIE ☆33 Updated 6 years ago
- Fast identification of character sequences in text or documents (multi-lingual) ☆18 Updated 8 years ago
- 🍱 bento is an English-based automation language designed to be used by non-technical people. ☆32 Updated 5 years ago
- Runs go generate recursively on a specified path or environment variable and can filter by regex ☆30 Updated 7 years ago
- Very fast one-time string searches in Go. Simple and composable. ☆12 Updated 4 years ago
- A simple, lightweight, embedded geocoder for Golang with city level accuracy ☆72 Updated 9 years ago
- Adds badger support to blevesearch ☆62 Updated last year
- Snowball stemmer for Go ☆46 Updated last year
- CLD2 (Compact Language Detector 2) bindings for Go (golang) ☆38 Updated 4 years ago
- Golang WARC (Web ARChive) Library ☆29 Updated 5 years ago
- Websocket implementation for fasthttp. ☆53 Updated 5 months ago
- A simple tool to collect and process web news from multiple sources ☆34 Updated 2 years ago
- A distributed forward caching proxy for Go's http.Client supporting TLS ☆31 Updated 6 years ago
- TextRank implementation in Golang with extendable features (summarization, phrase extraction) and multithreading (goroutine). ☆205 Updated 3 years ago
- An easy-to-use, lightweight embedded on-disk database built on Badger for use in your Go programs. ☆52 Updated 4 years ago
- Structured scraper for Go ☆25 Updated 6 years ago
- Bleve Extensions ☆47 Updated 7 months ago
- Go implementation of today's most used tokenizers ☆39 Updated 3 years ago
- Golang legofy, which legolizes images so they look as if they are made out of 1x1 LEGO blocks ☆29 Updated 5 years ago