tonywangcn / distributed-web-crawler
The Architecture of a Web Crawler: Building a Google-Inspired Distributed Web Crawler
☆125, updated last year
Alternatives and similar repositories for distributed-web-crawler
Users who are interested in distributed-web-crawler are comparing it to the libraries listed below.
- 27.6% of the Top 10 Million Sites are Dead (☆110, updated last year)
- Airbnb scraper made in Go (☆39, updated 7 months ago)
- A simple ChatGPT clone built using Go (☆39, updated 2 years ago)
- CLI utility to scrape emails from websites (☆171, updated 2 months ago)
- Golang Crawling and scraping framework (☆190, updated last month)
- estela, an elastic web scraping cluster 🕸 (☆194, updated 3 weeks ago)
- A new way to collect information from APIs and websites (☆122, updated 2 weeks ago)
- rotating open proxy multiplexer (☆193, updated 3 weeks ago)
- Golinkedin is a library written in pure Go for scraping LinkedIn (☆43, updated last year)
- GoScrapy: Harnessing Go's power for blazingly fast web scraping, inspired by Python's Scrapy framework. (☆104, updated 2 months ago)
- Get structured JSON data from any page. (☆178, updated 2 years ago)
- Agency: Robust LLM Agent Management with Go (☆71, updated last year)
- The Web Scraping Club Free Repository (☆158, updated 3 months ago)
- Chew is a Go library for processing various content types into markdown/plaintext. (☆42, updated 11 months ago)
- Spider ported to Python (☆103, updated 3 weeks ago)
- Conveyor CI is a headless, cloud-native CI/CD orchestration engine. (☆77, updated this week)
- [deprecated] AI Gateway - core infrastructure stack for building production-ready AI Applications (☆160, updated last year)
- Reverse Engineered Twitter's API (☆81, updated 2 years ago)
- Amazon crawler made in Go (☆41, updated 10 months ago)
- go-trafilatura is a Go port of the trafilatura Python library. (☆122, updated 4 months ago)
- microservices for you (☆143, updated 11 months ago)
- Private ChatGPT/Perplexity. Securely unlocks knowledge from confidential business information. (☆77, updated last year)
- Durable execution in Go with the Golang Inngest SDK. Write durable functions in your existing app. (☆88, updated last week)
- Go library for scraping or downloading files bypassing Cloudflare protection and browser checks (☆32, updated 4 years ago)
- A low-code data extractor for websites with built-in proxy and parsing capabilities. Great for testing and debugging CSS selectors (☆196, updated last year)
- A tutorial for web scraping using Playwright headless browser (☆144, updated 4 months ago)
- JotBot generates the missing code documentation for your Go and TypeScript projects. Powered by AI. (☆37, updated last year)
- Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications (☆106, updated last year)
- Production-ready, Light, and Flexible Webhook Infrastructure | Effortlessly Build Performant Webhook Integrations (☆12, updated last year)
- Unofficial Google Trends API for Go (☆87, updated 3 years ago)