WARC (Web Archive) Input and Output Formats for Hadoop
☆37Dec 7, 2014Updated 11 years ago
Alternatives and similar repositories for warc-hadoop
Users that are interested in warc-hadoop are comparing it to the libraries listed below.
- A flexible pure-Java OCR implementation. Eventually.☆20Jan 2, 2015Updated 11 years ago
- A Python library to simplify batch requests to AWS Services☆12Apr 25, 2020Updated 5 years ago
- Rainfall is an extensible Java framework to implement custom DSL-based stress and performance tests☆12Mar 31, 2026Updated last week
- This is a TREC evaluation demonstration written for a lecture on information retrieval evaluation.☆24Feb 12, 2018Updated 8 years ago
- ☆19Feb 7, 2016Updated 10 years ago
- ☆13Jul 2, 2025Updated 9 months ago
- Stanford CoreNLP NER addon for Apache Tika's NamedEntityParser☆13Feb 26, 2022Updated 4 years ago
- Generate a graph on Graph Commons from Instagram activity☆10Jan 25, 2016Updated 10 years ago
- IPython Notebook for Sentiment Classification☆10Nov 12, 2014Updated 11 years ago
- Readinglist client☆14Mar 12, 2015Updated 11 years ago
- A simple CDK app written in Kotlin using Gradle DSL☆12Dec 28, 2018Updated 7 years ago
- Simple app to manage currencies conversion in Django using openexchangerates.org service.☆10Nov 17, 2014Updated 11 years ago
- TREC Core track☆11Jul 5, 2017Updated 8 years ago
- An IRC client for the browser, written in AngularJS. Uses native sockets when packaged as a Chrome app (portable to other environments).☆29Jul 21, 2013Updated 12 years ago
- bindings to some parts of opencv to lua+torch☆15Feb 14, 2013Updated 13 years ago
- pymur is a Python interface to The Lemur Toolkit.☆19Sep 17, 2018Updated 7 years ago
- A collection of demonstration languages in Lua/Terra suitable for learning or for forking when creating a new language☆11Aug 27, 2015Updated 10 years ago
- A reinforcement learning package implemented in Torch☆11Jan 24, 2016Updated 10 years ago
- Parse a URL assuming that it's http/https, even if protocol or // isn't present☆17Oct 25, 2025Updated 5 months ago
- Java library for object oriented exception handling☆17Jun 7, 2018Updated 7 years ago
- Common web archive utility code.☆63Apr 1, 2026Updated last week
- Concurrent and distributed Prolog via join patterns (join calculus)☆12Mar 10, 2015Updated 11 years ago
- Spring Cloud Data Flow Streaming Example☆10Mar 17, 2018Updated 8 years ago
- Lambda Function to extract EXIF data from images uploaded to an S3 bucket and store it in DynamoDB.☆14Aug 17, 2018Updated 7 years ago
- A Common Lisp library for decompressing deflate, zlib, gzip, and bzip2 data☆18Jul 26, 2017Updated 8 years ago
- Application for ground-truthing semantic segmentation datasets in PyQt4/OpenCV.☆11Aug 15, 2017Updated 8 years ago
- Pure JAX-RS 2.0 ClientRequestFilter/WriterInterceptor used to sign AWS REST requests. Also has presign capabilities.☆15Jan 4, 2022Updated 4 years ago
- ✉️ A netlify lambda function that emails you tweets from a twitter list.☆16Mar 4, 2023Updated 3 years ago
- MySQL UDF executing Lua code with storage engine API☆19May 18, 2017Updated 8 years ago
- A Ruby implementation of Walker's Alias Method for quickly sampling from an array with a given probability distribution☆71Mar 16, 2013Updated 13 years ago
- Spark Custom Stream Source and Sink☆12Jan 19, 2019Updated 7 years ago
- Deep Learning (PyTorch) Models Deployment using SQL databases☆10Jul 25, 2021Updated 4 years ago
- New nixnote is cloned on miurahr/nixnote2 ... Nixnote (formerly Nevernote) is an incomplete Evernote OSS client. Here is a development branch…☆19Feb 16, 2013Updated 13 years ago
- Application simulating external APIs for the Practical Rx Workshop☆10May 16, 2015Updated 10 years ago
- JSON logging configuration for Spring Boot and ELK☆20May 6, 2017Updated 8 years ago
- Spring Data Aerospike☆36Jan 30, 2020Updated 6 years ago
- Command Line Tool to Help You Send Newsletters by Email☆14Apr 2, 2026Updated last week
- ☆24Jul 13, 2022Updated 3 years ago
- A TensorFlow 2.0 .whl file compiled with an old processor/computer☆11Dec 12, 2020Updated 5 years ago