WARC (Web Archive) Input and Output Formats for Hadoop
☆37Dec 7, 2014Updated 11 years ago
Alternatives and similar repositories for warc-hadoop
Users that are interested in warc-hadoop are comparing it to the libraries listed below.
- Rainfall is an extensible Java framework to implement custom DSL-based stress and performance tests☆12Mar 31, 2026Updated 2 months ago
- This is a TREC evaluation demonstration written for a lecture on information retrieval evaluation.☆24Feb 12, 2018Updated 8 years ago
- TACOTRON: TOWARDS END-TO-END SPEECH SYNTHESIS☆16Sep 26, 2017Updated 8 years ago
- IPython Notebook for Sentiment Classification☆10Nov 12, 2014Updated 11 years ago
- A set of reusable Java components that implement functionality common to any web crawler☆258Jun 3, 2026Updated last week
- TREC Core track☆11Jul 5, 2017Updated 8 years ago
- Launch AWS Elastic MapReduce jobs that process Common Crawl data.☆49Feb 15, 2017Updated 9 years ago
- The shared memory version of the Alternating Directions Implicit Solver for Isogeometric Analysis☆10Jan 26, 2019Updated 7 years ago
- bash loop to run tasks in the background. used as an anacron alternative☆13Nov 12, 2024Updated last year
- Spring Cloud Data Flow Streaming Example☆10Mar 17, 2018Updated 8 years ago
- Example source for MongoDB / JavaScript snippets☆27Mar 11, 2013Updated 13 years ago
- MySQL UDF executing Lua code with storage engine API☆19May 18, 2017Updated 9 years ago
- [FFCV-PL] manage fast data loading with ffcv and pytorch lightning☆16Jul 17, 2023Updated 2 years ago
- Spark Custom Stream Source and Sink☆12Jan 19, 2019Updated 7 years ago
- Warcbase is an open-source platform for managing and analyzing web archives☆162Dec 8, 2017Updated 8 years ago
- TREC Real-Time Summarization Tools☆15Jul 19, 2017Updated 8 years ago
- Application simulating external APIs for the Practical Rx Workshop☆10May 16, 2015Updated 11 years ago
- S1P demo for the power of Reactor Netty and Reactor Kafka in order to build Reactive System☆13May 28, 2019Updated 7 years ago
- This is the source code accompanying my blog post explaining the upside of using pure functions in Java.☆11Nov 5, 2020Updated 5 years ago
- Spring Data Aerospike☆36Jan 30, 2020Updated 6 years ago
- Docker Container for grab-site☆13Aug 26, 2024Updated last year
- A web interface for humans to interact with Beads - the issue tracker made for agents https://github.com/steveyegge/beads☆28Oct 16, 2025Updated 7 months ago
- ☆16Aug 8, 2014Updated 11 years ago
- geo location util☆15Nov 20, 2017Updated 8 years ago
- Hadoop tools for manipulating ClueWeb collections☆26Jul 15, 2016Updated 9 years ago
- ☆11Nov 18, 2022Updated 3 years ago
- Papirus e-ink display for Direwolf TNC and Pi Zero☆14Jul 31, 2023Updated 2 years ago
- Example Proteus Project☆11May 27, 2020Updated 6 years ago
- Open Source/Service libraries, examples, and experiments.☆42Jul 13, 2009Updated 16 years ago
- Implement functions to split strings☆13May 12, 2017Updated 9 years ago
- A short ScalaCheck tutorial for the Programming Principles course☆14Oct 4, 2021Updated 4 years ago
- A Prolog Implementation(Internal DSL, External DSL, REPL) in Scala.☆30Feb 19, 2010Updated 16 years ago
- Pairwise Controlled Manifold Approximation (PaCMAP) for dimensionality reduction☆20Feb 3, 2026Updated 4 months ago
- Kubernetes workshop☆16Sep 26, 2018Updated 7 years ago
- Fast optimizing Brainfuck interpreter in pure python☆14Nov 8, 2025Updated 7 months ago
- Study dotty source code using org-mode☆10Jan 10, 2017Updated 9 years ago
- ☆15Jul 10, 2018Updated 7 years ago
- Programming Assignments from coursera courses :)☆14Sep 22, 2013Updated 12 years ago
- Uses a genetic algorithm to "evolve" brainfuck programs with desirable behaviours☆12May 13, 2026Updated 3 weeks ago