scribe-org / Scribe-Data
Wikidata, Wiktionary and Wikipedia language data extraction
☆30 Updated this week
Related projects
Alternatives and complementary repositories for Scribe-Data
- Markdownloader makes it possible to fetch articles directly from the web. Inspired by the popular outline.com, it also ditches paywalls i… ☆12 Updated 3 years ago
- Self tracking your browser history! ☆20 Updated 10 months ago
- Static Site Generator for Viewing Web Archives (in WACZ) format ☆21 Updated last year
- Susie checks GitHub repositories for sustainability and provides interesting knowledge for developers regarding sustainable software deve… ☆24 Updated 5 months ago
- Digital Preservation of HTTP in documentary heritage. ☆22 Updated last year
- H2O is a web app for creating and reading open educational resources, primarily in the legal field ☆37 Updated last month
- This repo stores manifests of some public Wikibase instances. Manifests v2 are compatible with OpenRefine 3.6 and later versions. ☆19 Updated 2 years ago
- how hard is it to get a list of all local news sites in the United States (LOL) ☆8 Updated 4 years ago
- A tool for detecting viruses and NSFW material in WARC files ☆11 Updated 3 months ago
- A Fediverse robot account that posts the latest public records requests filed and completed at muckrock.com ☆14 Updated this week
- Synchronize your Mastodon bookmarks to bookmarking services ☆12 Updated this week
- Daily TV News Summary using GPT ☆21 Updated 7 months ago
- Datasette plugin for uploading CSV files and converting them to database tables ☆24 Updated 7 months ago
- Storyfinder - A Browser Plugin and Server Backend for Personalized Knowledge- and Information Management ☆15 Updated 8 months ago
- Website repository for Govdirectory - a crowdsourced and fact-checked directory of official governmental online accounts and services. ☆49 Updated this week
- TextractAI: Extract and process text from PDFs using Python, OpenAI API, and OCR techniques. ☆11 Updated 7 months ago
- Adds a reconciliation API endpoint to Datasette, based on the Reconciliation Service API specification. ☆23 Updated 9 months ago
- Web application for distributed compute analysis of Archive-It web archive collections. ☆15 Updated 2 months ago
- Python Module to use the Readwise API ☆16 Updated this week
- wrapper for the crossref events api ☆17 Updated last year
- Telegram > OpenAI > Read Later [instapaper/pocket/omnivore] ☆16 Updated last year
- Awesome list dedicated to digital and data preservation tools, sources, services and so on. ☆20 Updated 2 years ago
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects. ☆34 Updated last year
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆11 Updated last week
- Repository hosting the common code for the entity-fishing clients ☆9 Updated 6 months ago
- The GitHub repository containing all the material related to the Computational Thinking and Programming course of the Digital Humanities … ☆30 Updated 4 years ago
- A deep learning model for extracting references from text ☆25 Updated last year
- Awesome List of Women in Open Source ☆37 Updated 4 years ago
- A digital factory platform for managing files online with stable IDs, high-quality metadata, powerful API and tools for building on data:… ☆42 Updated 4 months ago
- A Python tool to search for and remove duplicated files in messy datasets ☆15 Updated 3 weeks ago