WARC (Web Archive) Input and Output Formats for Hadoop
☆38Dec 7, 2014Updated 11 years ago
Alternatives and similar repositories for warc-hadoop
Users that are interested in warc-hadoop are comparing it to the libraries listed below.
- A Python library to simplify batch requests to AWS Services☆12Apr 25, 2020Updated 6 years ago
- Rainfall is an extensible java framework to implement custom DSL based stress and performance tests☆12Mar 31, 2026Updated 3 months ago
- This is a TREC evaluation demonstration written for a lecture on information retrieval evaluation.☆24Feb 12, 2018Updated 8 years ago
- IPython Notebook for Sentiment Classification☆10Nov 12, 2014Updated 11 years ago
- A set of reusable Java components that implement functionality common to any web crawler☆259Jun 3, 2026Updated 3 weeks ago
- Simple app to manage currencies conversion in Django using openexchangerates.org service.☆10Nov 17, 2014Updated 11 years ago
- bindings to some parts of opencv to lua+torch☆15Feb 14, 2013Updated 13 years ago
- A collection of demonstration languages in Lua/Terra suitable for learning or for forking when creating a new language☆11Aug 27, 2015Updated 10 years ago
- AI-powered YouTube video analysis toolkit using MCP. Extract transcripts, generate knowledge graphs, generate high-quality detailed note…☆18Jul 5, 2025Updated 11 months ago
- Django connection app for musicbrainz database☆14Sep 13, 2014Updated 11 years ago
- Spring Cloud Data Flow Streaming Example☆10Mar 17, 2018Updated 8 years ago
- Allows you to convert csv formatted LastPass/KeePass 2.0 export to a KeePass 1.0 XML import.☆12Feb 15, 2015Updated 11 years ago
- Lambda Function to extract EXIF data from images uploaded to an S3 bucket and store it in DynamoDB.☆15Aug 17, 2018Updated 7 years ago
- Evaluation Kit of Joint Recovery of Dense Correspondence and Cosegmentation in Two Images (CVPR 2016)☆12Apr 25, 2018Updated 8 years ago
- A Common Lisp library for decompressing deflate, zlib, gzip, and bzip2 data☆18Jul 26, 2017Updated 8 years ago
- Application for ground-truthing semantic segmentation datasets in PyQt4/OpenCV.☆11Aug 15, 2017Updated 8 years ago
- Hi Spring fans! Welcome to another super short mid-season interregnum installment of Spring Tips in which I introduce a *super* prelimina…☆12Mar 21, 2019Updated 7 years ago
- Arteria is a high performance message channel system for IPC and network communication☆12Jun 21, 2017Updated 9 years ago
- MySQL UDF executing Lua code with storage engine API☆19May 18, 2017Updated 9 years ago
- [FFCV-PL] manage fast data loading with ffcv and pytorch lightning☆16Jul 17, 2023Updated 2 years ago
- The Architecture of Open Source Applications☆13Nov 24, 2013Updated 12 years ago
- Ecore meta-model and examples☆15Sep 19, 2017Updated 8 years ago
- Spark Custom Stream Source and Sink☆12Jan 19, 2019Updated 7 years ago
- apache shiro with hibernate example project☆11Mar 4, 2014Updated 12 years ago
- New nixnote is cloned on miurahr/nixnote2 ... Nixnote (formerly nevernote) is an incomplete Evernote OSS client. Here is a development branch…☆19Feb 16, 2013Updated 13 years ago
- Application simulating external APIs for the Practical Rx Workshop☆10May 16, 2015Updated 11 years ago
- admin-ui-boshrelease☆17Mar 16, 2017Updated 9 years ago
- S1P demo for the power of Reactor Netty and Reactor Kafka in order to build Reactive System☆13May 28, 2019Updated 7 years ago
- This is the source code accompanying my blog post explaining the upside of using pure functions in Java.☆11Nov 5, 2020Updated 5 years ago
- JSON logging configuration for Spring Boot and ELK☆20May 6, 2017Updated 9 years ago
- A web interface for humans to interact with Beads - the issue tracker made for agents https://github.com/steveyegge/beads☆28Oct 16, 2025Updated 8 months ago
- Command Line Tool to Help You Send Newsletters by Email☆13Jun 23, 2026Updated last week
- ☆24Jul 13, 2022Updated 3 years ago
- A TensorFlow 2.0 .whl file compiled with an old processor/computer☆11Dec 12, 2020Updated 5 years ago
- A project to apply a traditional implementation of Slurm on Kubernetes (with some magic)☆11Dec 20, 2017Updated 8 years ago
- Overlays kml data on a user specified image☆127May 10, 2015Updated 11 years ago
- Blazing fast HDF5 Image Generator for Keras☆12Jul 11, 2020Updated 5 years ago
- ☆16Aug 8, 2014Updated 11 years ago
- geo location util☆15Nov 20, 2017Updated 8 years ago