leonardr / cce-python
Python tools for processing data from the Catalog of Copyright Entries
☆38 Updated 5 years ago
Alternatives and similar repositories for cce-python
Users who are interested in cce-python are comparing it to the libraries listed below.
- NYPL Project to transcribe and parse pages from the US Catalog of Copyright Entries ☆58 Updated 2 years ago
- Tab-delimited versions of Catalog of Copyright Entries renewals ☆28 Updated 6 years ago
- A dockerized, queued high fidelity web archiver based on Squidwarc ☆60 Updated 11 months ago
- track changes to the news, where news is anything with an RSS feed ☆178 Updated 5 years ago
- recursively deduplicate a directory and write its contents to a new directory while remembering the old paths ☆49 Updated 4 years ago
- National Poetry Generation Month 2017 ☆13 Updated 8 years ago
- A command line utility for listing and searching snapshots in web archives ☆16 Updated last year
- Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user ac… ☆54 Updated 4 months ago
- Documentation for the GITenberg books project ☆29 Updated 6 years ago
- Convert Directories, Files and ZIP Files to Web Archives (WARC) ☆85 Updated 2 months ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io ☆38 Updated 9 years ago
- The One True Open Access Button - cross-compatible extension for research papers and data. ☆45 Updated 8 months ago
- My name vs Oxymnndms, kmq of km / Look on my works, ye uny, and despaw ☆27 Updated 7 years ago
- Insert matching punctuation for mismatched quotation marks, parentheses, etc. Good postprocessing for N-gram text synthesis. ☆15 Updated 9 years ago
- Nondestructive warc-in-tar to warc conversion ☆26 Updated 12 years ago
- A command-line tool for interacting with books in git ☆111 Updated 10 months ago
- command line resource for working with digital primary sources ☆27 Updated 6 years ago
- NaNoGenMo ☆36 Updated 8 months ago
- 📚 A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity ☆95 Updated 6 years ago
- Comparing warc files ☆17 Updated 6 years ago
- A list of things related to software, literature, and other content for 🕣 Memento ☆99 Updated last year
- Pages repo ☆89 Updated 3 years ago
- Automatic alignment of books between HathiTrust, Internet Archive, Google Books, etc. ☆35 Updated 2 months ago
- workspace for the development of xZINECOREx metadata schema for cataloging print zines ☆13 Updated 6 years ago
- Library of Congress coding standards ☆30 Updated last year
- Sort-friendly URI Reordering Transform (SURT) python module ☆42 Updated 10 months ago
- Strips boilerplate from Project Gutenberg text files ☆16 Updated 3 years ago
- Raspberry Pi image for controlling a DIYBookScanner via spreads ☆37 Updated 10 years ago
- Create markov chain ("_ebooks") accounts on Twitter ☆59 Updated 5 years ago
- Some tools to help analyze the twitter archive ☆62 Updated 2 weeks ago