Simplified version of a common crawl fetcher
☆16Dec 24, 2025Updated 5 months ago
Alternatives and similar repositories for commoncrawl-fetcher-lite
Users that are interested in commoncrawl-fetcher-lite are comparing it to the libraries listed below.
- A DropWizard wrapper around Apache Tika.☆10Dec 22, 2016Updated 9 years ago
- Single server/laptop grade file-observatory☆10Mar 30, 2023Updated 3 years ago
- File-tests is a test suite for the File tool. Previous home: https://fedorahosted.org/file-tests/☆21Jun 3, 2026Updated last week
- Continuous build system used by Mono and Moonlight.☆34Apr 8, 2020Updated 6 years ago
- Software in this repository is not maintained anymore☆11Jul 6, 2022Updated 3 years ago
- Synapse Rapid Power-up for SinkDB☆11Jun 24, 2025Updated 11 months ago
- Efficient Message Digest for MXF Files☆10Jul 6, 2020Updated 5 years ago
- Miscellaneous small bits and bobs.☆11Sep 8, 2025Updated 9 months ago
- A small tool which uses the CommonCrawl URL Index to download documents with certain file types or mime-types. This is used for mass-test…☆74Jun 7, 2026Updated last week
- An open source route planning library and server using OpenStreetMap.☆13May 26, 2026Updated 3 weeks ago
- Efficient indexing and retrieval of OCR bounding boxes in Solr☆22Mar 13, 2019Updated 7 years ago
- Automatically spider the result set of a Censys/Shodan search and download all files where the file name or folder path matches a regex.☆29Apr 22, 2023Updated 3 years ago
- Solving CAPTCHA with Image Classification☆10Mar 13, 2025Updated last year
- Tools for preservation of floppy disks☆15Mar 25, 2026Updated 2 months ago
- Open-source web application to keep track of all data processing activities prefigured by GDPR Article 30 "Records of processing activiti…☆24Apr 21, 2023Updated 3 years ago
- PetaTest is a tiny but powerful, embeddable, dependency-free unit testing framework for .NET and Mono.☆13Jul 23, 2018Updated 7 years ago
- This repository tracks the changes to the "Unix Timesharing System" paper written by Dennis Ritchie and Ken Thompson.☆11Oct 6, 2018Updated 7 years ago
- A script to automate the creation of cloud infrastructure for hash cracking.☆15Sep 4, 2019Updated 6 years ago
- A museum of historical and modern regular expression engines, showing their development and influence☆25Apr 26, 2026Updated last month
- Package software with ease 📦 Versatile deb, rpm and apk packager fueled by PKGBUILD specfiles and golang☆13Mar 4, 2024Updated 2 years ago
- Academy Spectral Similarity Index Calculator☆10Aug 7, 2024Updated last year
- ☆10Apr 20, 2015Updated 11 years ago
- ATNwalk is a grammar-based input generator for fuzzing and other evolutionary algorithms. It relies on binary-level mutations to bit sequ…☆11Dec 10, 2024Updated last year
- W3C Validators in Java☆21Updated this week
- IEEE Computer Society Keywords to Organize Knowledge☆13Jan 13, 2020Updated 6 years ago
- ☆30Mar 3, 2021Updated 5 years ago
- NGINX Router Mesh Network Architecture for Microservices☆19May 24, 2023Updated 3 years ago
- fundamental traits to describe an architecture in the yaxpeax project☆17Mar 1, 2025Updated last year
- LLAP is an LLVM-based tool for generating enriched program dependency graphs (ePDGs) from program source code that are suitable for use i…☆16May 17, 2023Updated 3 years ago
- Gradle plugin for Classycle dependency analyzer☆13Dec 18, 2017Updated 8 years ago
- A small tool to easily mount APFS image on macOS for forensics.☆17Jul 30, 2020Updated 5 years ago
- Constant-time choose between two variables in Clang/LLVM☆21Apr 14, 2018Updated 8 years ago
- ☆12Oct 24, 2015Updated 10 years ago
- A way to sync and configure your Suunto watch on Linux☆14Feb 10, 2022Updated 4 years ago
- AI-powered terminal session logger and analyzer. Save a summary of each session and query for it within a catalog 📟🤖☆16Aug 5, 2024Updated last year
- Simple Docker container to run a Tor node.☆15Jun 5, 2026Updated last week
- Python binding for NuSMV.☆11Nov 29, 2017Updated 8 years ago
- Plugins for QSYS☆11May 27, 2026Updated 2 weeks ago
- Visio .vsdx parser based on POI (merged into POI as of 3.14)☆16Dec 8, 2015Updated 10 years ago