WARC (Web Archive) Input and Output Formats for Hadoop
☆37Dec 7, 2014Updated 11 years ago
Alternatives and similar repositories for warc-hadoop
Users that are interested in warc-hadoop are comparing it to the libraries listed below.
- A flexible pure-Java OCR implementation. Eventually.☆20Jan 2, 2015Updated 11 years ago
- A Python library to simplify batch requests to AWS Services☆12Apr 25, 2020Updated 6 years ago
- This is a TREC evaluation demonstration written for a lecture on information retrieval evaluation.☆24Feb 12, 2018Updated 8 years ago
- ☆19Feb 7, 2016Updated 10 years ago
- Stanford CoreNLP NER addon for Apache Tika's NamedEntityParser☆13Feb 26, 2022Updated 4 years ago
- Generate a graph on Graph Commons from Instagram activity☆10Jan 25, 2016Updated 10 years ago
- IPython Notebook for Sentiment Classification☆10Nov 12, 2014Updated 11 years ago
- Readinglist client☆14Mar 12, 2015Updated 11 years ago
- TREC Core track☆11Jul 5, 2017Updated 8 years ago
- pymur is a Python interface to The Lemur Toolkit.☆19Sep 17, 2018Updated 7 years ago
- A collection of demonstration languages in Lua/Terra suitable for learning or for forking when creating a new language☆11Aug 27, 2015Updated 10 years ago
- Parse a URL assuming that it's http/https, even if protocol or // isn't present☆17Oct 25, 2025Updated 6 months ago
- AI-powered YouTube video analysis toolkit using MCP. Extract transcripts, generate knowledge graphs, generate high-quality detailed note…☆14Jul 5, 2025Updated 9 months ago
- bash loop to run tasks in the background. used as an anacron alternative☆13Nov 12, 2024Updated last year
- Java library for object oriented exception handling☆17Jun 7, 2018Updated 7 years ago
- Concurrent and distributed Prolog via join patterns (join calculus)☆12Mar 10, 2015Updated 11 years ago
- Spring Cloud Data Flow Streaming Example☆10Mar 17, 2018Updated 8 years ago
- Allows you to convert csv formatted LastPass/KeePass 2.0 export to a KeePass 1.0 XML import.☆12Feb 15, 2015Updated 11 years ago
- Evaluation Kit of Joint Recovery of Dense Correspondence and Cosegmentation in Two Images (CVPR 2016)☆12Apr 25, 2018Updated 8 years ago
- Application for ground-truthing semantic segmentation datasets in PyQt4/OpenCV.☆11Aug 15, 2017Updated 8 years ago
- Hi Spring fans! Welcome to another super short mid-season interregnum installment of Spring Tips in which I introduce a *super* prelimina…☆12Mar 21, 2019Updated 7 years ago
- Example source for MongoDB / JavaScript snippets☆27Mar 11, 2013Updated 13 years ago
- [FFCV-PL] manage fast data loading with ffcv and pytorch lightning☆16Jul 17, 2023Updated 2 years ago
- The Architecture of Open Source Applications☆13Nov 24, 2013Updated 12 years ago
- Spark Custom Stream Source and Sink☆12Jan 19, 2019Updated 7 years ago
- (Deprecated)☆17Jan 9, 2016Updated 10 years ago
- apache shiro with hibernate example project☆11Mar 4, 2014Updated 12 years ago
- TREC Real-Time Summarization Tools☆15Jul 19, 2017Updated 8 years ago
- New nixnote is cloned on miurahr/nixnote2 ... Nixnote (formerly Nevernote) is an incomplete OSS Evernote client. Here is a development branch…☆19Feb 16, 2013Updated 13 years ago
- Application simulating external APIs for the Practical Rx Workshop☆10May 16, 2015Updated 10 years ago
- S1P demo for the power of Reactor Netty and Reactor Kafka in order to build Reactive System☆13May 28, 2019Updated 6 years ago
- TWS Market Data Adapter☆20May 10, 2018Updated 7 years ago
- A web interface for humans to interact with Beads - the issue tracker made for agents https://github.com/steveyegge/beads☆25Oct 16, 2025Updated 6 months ago
- A TensorFlow 2.0 .whl file compiled with an old processor/computer☆11Dec 12, 2020Updated 5 years ago
- A project to apply a traditional implementation of Slurm on Kubernetes (with some magic)☆11Dec 20, 2017Updated 8 years ago
- Pairwise Controlled Manifold Approximation (PaCMAP) for dimensionality reduction☆20Feb 3, 2026Updated 2 months ago
- Overlays kml data on a user specified image☆128May 10, 2015Updated 10 years ago
- ☆16Aug 8, 2014Updated 11 years ago
- Load WARC files into Apache Spark with sparklyr☆12Jan 11, 2022Updated 4 years ago