Support for writing WARC files with Scrapy
☆24Dec 21, 2019Updated 6 years ago
Alternatives and similar repositories for scrapy-warcio
Users that are interested in scrapy-warcio are comparing it to the libraries listed below.
- ☆13Dec 28, 2022Updated 3 years ago
- A command line utility for listing and searching snapshots in web archives☆17Dec 21, 2023Updated 2 years ago
- sign elf binaries with GPG☆17Oct 10, 2016Updated 9 years ago
- A multi App to download file from LibGen.io☆12Aug 5, 2019Updated 6 years ago
- Basis for constructing a new project on top of mu.semte.ch☆16Mar 1, 2026Updated last month
- Scrape and structure raw data from the Norwegian parliament's API.☆12Oct 24, 2025Updated 5 months ago
- Python wrapper for phonetisaurus grapheme to phoneme tool☆12Mar 11, 2021Updated 5 years ago
- Comparing warc files☆17Feb 21, 2019Updated 7 years ago
- Extracts plain text, language identification and more metadata from WARC records☆23Oct 1, 2025Updated 6 months ago
- Command line tool for digging into WARC files☆51Apr 10, 2026Updated last week
- Single file C header for UTF-x-to-y conversions + helpers☆13Jun 11, 2023Updated 2 years ago
- Norwegian Speech Transformer Models☆19Mar 26, 2026Updated 3 weeks ago
- A trend viewer written in Python/JavaScript☆21Nov 15, 2024Updated last year
- Example configurations for the Community Solid Server☆22Mar 9, 2026Updated last month
- Converts HTTrack crawls to WARC files☆34Aug 6, 2024Updated last year
- CDXJ Indexing of WARC/ARCs☆33Dec 10, 2024Updated last year
- A Simple C++ based CSSParser☆18Apr 12, 2026Updated last week
- Sublime Text API Version Documenter☆11Jan 3, 2023Updated 3 years ago
- Nondestructive warc-in-tar to warc conversion☆27Apr 21, 2013Updated 12 years ago
- Scripts to automate IPv6 maintenance on RouterOS, and more☆15Jan 4, 2026Updated 3 months ago
- Fixed Point Math in C++ for Playstation 1☆12Aug 21, 2023Updated 2 years ago
- PurePortable☆18Feb 26, 2026Updated last month
- The source code for the beyondgrep.com website☆45May 26, 2025Updated 10 months ago
- StumpWM Debugger☆11Apr 19, 2025Updated last year
- DHLAB is a library of python modules for accessing text and pictures at the National Library of Norway.☆25Oct 13, 2025Updated 6 months ago
- Twisted Metal reverse-engineering☆15May 19, 2022Updated 3 years ago
- Some of CrackMes made by me :)☆18Dec 24, 2021Updated 4 years ago
- 🦄 Shades of Purple — A professional theme with hand-picked & bold shades of purple for Base16.☆13Jan 13, 2023Updated 3 years ago
- A simple zsh completion file for borgbackup☆11Nov 16, 2017Updated 8 years ago
- ☆12Jan 17, 2025Updated last year
- ☆15Aug 28, 2025Updated 7 months ago
- library for reading Microsoft Outlook PST files☆41Feb 9, 2025Updated last year
- clarin-dspace digital repository based on DSpace and LINDAT/CLARIN DSpace☆28Updated this week
- Implementation of Max Kellermann's exploit for CVE-2022-0847☆12Mar 8, 2022Updated 4 years ago
- A MQTT Client for ComputerCraft☆11Jan 27, 2024Updated 2 years ago
- A command utility to read and monitor the NTFS/ReFS USN change Journal.☆22Jul 10, 2025Updated 9 months ago
- Saves proxied HTTP traffic to a WARC file.☆28Oct 22, 2013Updated 12 years ago
- This repository makes available the Talk of Norway (ToN) dataset, a collection of Norwegian parliament speeches from 1998 to 2016. Every …☆31Aug 2, 2023Updated 2 years ago
- RDF vocabulary and hypermedia specification to publish your Linked Data using search trees☆29Jan 14, 2026Updated 3 months ago