Fureteur is a simple, configurable, fault-tolerant web crawler written in Scala
☆28Oct 14, 2014Updated 11 years ago
Alternatives and similar repositories for fureteur
Users that are interested in fureteur are comparing it to the libraries listed below.
- A library of examples showing how to use the Common Crawl corpus (2008-2012, ARC format)☆65Aug 5, 2016Updated 9 years ago
- diff large files without running out of memory; only unified format; probably buggy, but ~no memory usage☆14Mar 6, 2014Updated 12 years ago
- opentracing for pure applications☆17Jan 8, 2019Updated 7 years ago
- Spring Design Patterns and Best Practices [video], published by Packt☆13Jan 30, 2023Updated 3 years ago
- An HTTP server packaged with postgresql, jaeger-all-in-one, and perf-test to record ad deliveries, clicks, and installs, and query the st…☆10Sep 16, 2021Updated 4 years ago
- Example Chrome extension built with Angular and the Yeoman Chrome extension generator that modifies the Google search page with a 'Search…☆15Mar 8, 2014Updated 12 years ago
- Proof of concept for Starship rewrite☆10Mar 4, 2026Updated 3 weeks ago
- Zaiste, these are awesome dotfiles.☆11May 18, 2023Updated 2 years ago
- tool for postgres to automatically build rest services and web forms☆13Updated this week
- Scala implementations of standard algorithms for Multi-Armed Bandits Problem.☆12May 7, 2016Updated 9 years ago
- Crawl-Anywhere - Web Crawler and document processing pipeline with Solr integration.☆98Jul 1, 2017Updated 8 years ago
- Java implementation of Online LDA Algorithm☆15Dec 24, 2013Updated 12 years ago
- Autoproxy automatically detects proxies and stores them in the respective environment variables (e.g. http_proxy).☆13Oct 2, 2016Updated 9 years ago
- Open Pi Phone project☆15Oct 7, 2014Updated 11 years ago
- Instantly turn your data into charts and dashboards. It's like a mini Tableau.☆27Jan 19, 2023Updated 3 years ago
- ☆10Feb 26, 2019Updated 7 years ago
- Scala DSL for web crawling☆149Aug 2, 2016Updated 9 years ago
- Scala helpers for Dropwizard.☆86Aug 16, 2016Updated 9 years ago
- A free multithreaded proxy checking program written in Java. Load a proxy list and check each proxy to verify it's alive to create a new …☆11Nov 5, 2015Updated 10 years ago
- Web page content extractor☆31Feb 26, 2013Updated 13 years ago
- ☆14Aug 3, 2020Updated 5 years ago
- Scala utilities for teaching computational linguistics and prototyping algorithms.☆42Dec 29, 2012Updated 13 years ago
- Standalone JavaScript client for websocket-rails.☆10Apr 7, 2015Updated 10 years ago
- Real-time, collaborative threat modeling tool.☆15Updated this week
- A node.js command line app that monitors Java processes via JMX☆13Jan 28, 2015Updated 11 years ago
- Web scraper for Scala☆37Mar 11, 2013Updated 13 years ago
- Kairos combines a focused crawler and an information extraction engine to convert a list of conference websites into an index filled wit…☆18Feb 20, 2011Updated 15 years ago
- Automatic CAPTCHA decoding☆11Apr 17, 2012Updated 13 years ago
- Spring Boot Web with Hessian☆11Jul 2, 2014Updated 11 years ago
- A Nutch 2.2.1 plugin which allows users to shuffle off the responsibility for retrieving pages to a selenium hub/node spoke system. This …☆16Jun 9, 2016Updated 9 years ago
- Windows Live API binding and connect support.☆18Dec 1, 2024Updated last year
- Cloud drive (netdisk) file search built on search engines☆12Nov 15, 2018Updated 7 years ago
- A PCSC binding for NodeJS☆27May 12, 2016Updated 9 years ago
- MOVED TO: github.com/akka/akka-persistence-dynamodb☆24Jan 30, 2016Updated 10 years ago
- Functional HTML5 and XML library for the Scala platform☆110Dec 10, 2020Updated 5 years ago
- just a basic rootkit for learning how to play with sys_call_table☆16Sep 12, 2016Updated 9 years ago
- port.js is an expanded version of Michael Gundlach’s Chrome-(and-Opera!)–to–Safari porting library for extensions <https://adblockforchro…☆61Aug 27, 2013Updated 12 years ago
- Image IO read and write☆11Oct 19, 2016Updated 9 years ago
- Distributed Java crawler with a master/slave control mechanism☆14May 21, 2015Updated 10 years ago