How Media Cloud approaches extracting metadata from online news stories
☆17Dec 22, 2024Updated last year
Alternatives and similar repositories for metadata-lib
Users that are interested in metadata-lib are comparing it to the libraries listed below.
- Summarize and ask questions about items in the Internet Archive☆18Apr 1, 2023Updated 3 years ago
- ☆17Nov 26, 2024Updated last year
- Selected code and data for The Online Books Page and related applications☆11Apr 1, 2026Updated last week
- Automated Damage Assessment using Deep Learning☆14Jun 25, 2025Updated 9 months ago
- Python binding for gumbo-parser using Cython☆14Aug 16, 2016Updated 9 years ago
- A tool for detecting viruses and NSFW material in WARC files☆18Dec 16, 2025Updated 3 months ago
- A whirlwind tour of Common Crawl's data using Python☆38Apr 1, 2026Updated last week
- High Availability Shared Pipeline Engine☆17Sep 15, 2023Updated 2 years ago
- Illuminating the scope and content of digital text collections☆13Jul 28, 2015Updated 10 years ago
- A classifier for detecting soft 404 pages☆17Sep 10, 2022Updated 3 years ago
- A polite and user-friendly downloader for Common Crawl data☆74Updated this week
- Nyss: a tool developed with the Red Cross for Community-Based Surveillance☆23Oct 26, 2023Updated 2 years ago
- Read and write WARC files in Go☆50Mar 31, 2026Updated 2 weeks ago
- ☆22Jan 9, 2025Updated last year
- PostGis extension for Kysely☆25Mar 20, 2025Updated last year
- Introduction to data journalism☆14Dec 19, 2018Updated 7 years ago
- Generate machine learning models fully automatically to classify any images using SERP data☆12Aug 25, 2022Updated 3 years ago
- Library for the Streaming Protocol for Exchange of Astronomical Data (SPEAD)☆27Updated this week
- Collection of all code samples referenced on https://www.samproell.io☆15Apr 8, 2024Updated 2 years ago
- A creator interface for the Pulp viewer.☆24Apr 6, 2016Updated 10 years ago
- Scrape South African news☆12May 22, 2023Updated 2 years ago
- https://www.coursera.org/learn/cryptocurrency☆12Oct 28, 2017Updated 8 years ago
- Front-end for the MediaCloud database☆16Apr 3, 2018Updated 8 years ago
- Keyword extraction using Scake, KeyBERT, Fine-tuning Transformer BERT-like models and ChatGPT.☆12May 22, 2023Updated 2 years ago
- libgpiod node bindings☆42Apr 6, 2026Updated last week
- Codebase, data and models for the Headline Grouping paper at NAACL2021☆12Oct 2, 2022Updated 3 years ago
- A Rust crate offering similar functionality to the Python transformers package using Candle.☆14Nov 19, 2024Updated last year
- An apa7 template for quarto/posit☆12Jan 25, 2023Updated 3 years ago
- Reading comprehension based question-answering model for news articles.☆11Jun 22, 2022Updated 3 years ago
- A Solara web app for visualizing Maxar Open Data☆37Mar 16, 2026Updated 3 weeks ago
- 🎧 Simple bash-script to automatically download the most recent podcasts from a list of rss-feeds and upload them to your Dropbox.☆10Nov 30, 2015Updated 10 years ago
- Code for reconstructing full-text news articles from the GDELT Web News NGrams 3.0 dataset☆24Feb 2, 2026Updated 2 months ago
- OutRun game written in JavaScript.☆15Jan 27, 2023Updated 3 years ago
- A template for DIN 5008 inspired typst letter☆16Sep 18, 2023Updated 2 years ago
- ☆18Mar 25, 2025Updated last year
- ☆14Jan 25, 2026Updated 2 months ago
- Pre-processing DBpedia datasets to load into Dgraph☆13Mar 6, 2022Updated 4 years ago
- end-to-end information extraction pipeline built by LayoutLMV2, pretrained model from HuggingFace☆11Aug 15, 2023Updated 2 years ago
- Open source RAG with Llama Index for Japanese LLM in low resource setting☆10May 12, 2025Updated 11 months ago