noelmartinon / mboxzilla
Export / upload emails from Thunderbird mbox files to single eml files
☆23 Updated 2 years ago
Alternatives and similar repositories for mboxzilla
Users who are interested in mboxzilla compare it to the repositories listed below.
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆65 Updated last year
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆48 Updated last week
- Recover lost websites from the Web Infrastructure ☆89 Updated 3 weeks ago
- ☆39 Updated last year
- Search engine for structured data ☆24 Updated 5 months ago
- Universal backend for indexing, storing, and querying documents. ☆25 Updated 5 years ago
- Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication. ☆128 Updated last week
- Tool to index and serve HTML files. Powered by Datasette. ☆105 Updated 3 years ago
- A PDF classifier ensemble with REST API service ☆23 Updated 4 years ago
- Make your PDF files text-searchable (A GUI for OCRmyPDF) ☆46 Updated last year
- CLI utility to find duplicate files ☆116 Updated 2 years ago
- Tool for downloading all images from a given Flickr account in parallel for max download speed. All images are downloaded in original siz… ☆23 Updated 5 years ago
- Tool and library for handling Web ARChive (WARC) files. ☆163 Updated 10 months ago
- Trough: Big data, small databases. ☆42 Updated last year
- Short script for removing watermarks from PDF files. Requires pdftk. ☆59 Updated 6 years ago
- A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service ☆182 Updated 10 months ago
- Prepress preparing tool and PDF editor ☆18 Updated last year
- ☆78 Updated 3 years ago
- A simple Python script that archives all the messages from a public Yahoo Group ☆59 Updated 5 years ago
- Fast PDF generation and compression. Deals with millions of pages daily. ☆122 Updated last week
- Docker Compose based system for running remote browsers (including Flash and Java support) connected to web archives ☆15 Updated 4 years ago
- A small, simple cross-platform utility to process many files in parallel. ☆84 Updated 5 years ago
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents) ☆164 Updated 2 weeks ago
- Simplified version of a common crawl fetcher ☆16 Updated this week
- Charset converter tool and library ☆139 Updated 5 months ago
- Eudora for Windows source code ☆12 Updated 7 years ago
- ☆58 Updated last year
- PDF to DjVu converter ☆100 Updated last year
- Personal WayBack Machine ☆128 Updated 5 years ago
- Produce, verify and repair par2 files recursively. ☆114 Updated 8 months ago