Analyze and extract Wikipedia article text and attributes, and store them in an Elasticsearch index or in JSON files (multilingual support)
☆48 · Aug 14, 2023 · Updated 2 years ago
Alternatives and similar repositories for wikipedia-to-elastic
Users who are interested in wikipedia-to-elastic are comparing it to the libraries listed below.
- ☆17 · Oct 25, 2018 · Updated 7 years ago
- Detecting Trends in Job Advertisements ☆20 · Aug 13, 2018 · Updated 7 years ago
- OKR: A Consolidated Open Knowledge Representation for Multiple Texts ☆41 · Jan 25, 2018 · Updated 8 years ago
- Web application for interactive graphs, anomaly highlighting and online monitoring. ☆17 · Mar 15, 2016 · Updated 9 years ago
- Collection of code snippets and utilities for streamlit apps ☆22 · Apr 2, 2020 · Updated 5 years ago
- A field-tested Hebrew tokenizer for dirty texts (ben-yehuda project, bible, cc100, mc4, opensubs, oscar, twitter) focused on multi-word e… ☆23 · Aug 13, 2022 · Updated 3 years ago
- Web hub based on Wikidata ☆38 · Dec 7, 2025 · Updated 3 months ago
- OVALChat is a customizable Web app aimed at conducting user studies with chatbots ☆28 · Jan 9, 2024 · Updated 2 years ago
- Extract transform load CLI tool for extracting small and middle data volume from sources (databases, csv files, xls files, gspreadsheets)… ☆11 · Mar 1, 2026 · Updated last week
- Graph-based framework for text classification ☆24 · Oct 4, 2018 · Updated 7 years ago
- Twitter Text Libraries for Python ☆29 · Oct 17, 2023 · Updated 2 years ago
- Search comments and highlights annotations in PDF documents. ☆12 · May 4, 2023 · Updated 2 years ago
- ☆32 · Aug 4, 2021 · Updated 4 years ago
- A coreference evaluation package for the CoNLL and ARRAU datasets ☆42 · Oct 3, 2020 · Updated 5 years ago
- ☆13 · May 8, 2024 · Updated last year
- ☆10 · Nov 20, 2024 · Updated last year
- Abstract Meaning Representation (AMR) reader ☆35 · Feb 24, 2020 · Updated 6 years ago
- Applescripts for controlling Spotify ☆23 · Oct 20, 2016 · Updated 9 years ago
- The Rails application for Turkopticon ☆10 · May 10, 2017 · Updated 8 years ago
- Index and Search Your Private PDF Collection ☆18 · Jan 16, 2016 · Updated 10 years ago
- A framework for evaluating Machine Translation models. ☆12 · May 26, 2025 · Updated 9 months ago
- Abusing Certificate Transparency logs for getting HTTPS websites subdomains. ☆11 · Mar 2, 2019 · Updated 7 years ago
- An autonomous service implementing a decentralized Impact Evaluator ☆13 · Updated this week
- ☆37 · Jun 12, 2023 · Updated 2 years ago
- Github mirror - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing) ☆37 · Jun 10, 2024 · Updated last year
- Multiple correspondence analysis ☆10 · Apr 2, 2015 · Updated 10 years ago
- Drag-and-drop to find text. A work in progress. ☆14 · Oct 6, 2022 · Updated 3 years ago
- Parses a document (scanned or phone captured) and returns the underlying question - answer layout structured capture by LayoutXLM model ☆10 · Jun 14, 2021 · Updated 4 years ago
- Convert ALTO XML to plain text + minimal metadata ☆17 · Oct 17, 2024 · Updated last year
- Neural machine translation with Recurrent Deterministic Policy Gradient ☆10 · Aug 18, 2016 · Updated 9 years ago
- minimalistic wrapper for chatgpt api for better prompt engineering ☆11 · Jul 9, 2023 · Updated 2 years ago
- A place to share and discover feeds. ☆14 · Aug 4, 2023 · Updated 2 years ago
- Trigger an LLM in your CI/CD to auto-complete your work ☆11 · Apr 5, 2023 · Updated 2 years ago
- Generate zsh completion functions from manpage or `--help` ☆10 · Mar 18, 2020 · Updated 5 years ago
- A simple module/way to use Perplexity AI in Python. ☆13 · May 9, 2024 · Updated last year
- ☆10 · Jun 9, 2017 · Updated 8 years ago
- ☆10 · Apr 4, 2023 · Updated 2 years ago
- ☆10 · Oct 28, 2019 · Updated 6 years ago
- Code for "The Whole Truth and Nothing But the Truth: Faithful and Controllable Dialogue Response Generation with Dataflow Transduction an… ☆11 · Apr 30, 2024 · Updated last year