A set of Python scripts for preprocessing the Wikidata JSON dump and running simple queries in an efficient manner.
☆142 · Oct 17, 2024 · Updated last year
Alternatives and similar repositories for simple-wikidata-db
Users who are interested in simple-wikidata-db compare it to the repositories listed below.
- Mapping Wikipedia pages to Wikidata IDs and vice versa. ☆174 · May 11, 2023 · Updated 2 years ago
- [EMNLP 2022] TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models ☆74 · May 15, 2024 · Updated last year
- ReFinED is an efficient and accurate entity linking (EL) system. ☆235 · Dec 13, 2024 · Updated last year
- PyTorch - Albert Large V2, Bert Base Uncased, Bert Large Uncased WWM Finetuned Squad, Distil Roberta Base, Roberta Base Squad2, Roberta l… ☆11 · Jul 10, 2020 · Updated 5 years ago
- init ☆13 · Feb 3, 2021 · Updated 5 years ago
- Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs (EMNLP 2024) ☆16 · Nov 17, 2024 · Updated last year
- ☆20 · Feb 14, 2023 · Updated 3 years ago
- ☆15 · Jul 8, 2024 · Updated last year
- import a subset or a full Wikidata dump into a CouchDB database ☆21 · Sep 10, 2024 · Updated last year
- Source code for TACL paper "KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation". ☆213 · May 3, 2024 · Updated last year
- Official Repository for paper "Ontology-Free General-Domain Knowledge Graph-to-Text Generation Dataset Synthesis using Large Language Mod… ☆15 · Nov 25, 2024 · Updated last year
- [EMNLP 2022] Code for our paper “ZeroGen: Efficient Zero-shot Learning via Dataset Generation”. ☆16 · Feb 18, 2022 · Updated 4 years ago
- ☆33 · Jan 11, 2024 · Updated 2 years ago
- Brave is a simple visualisation library for NLP information extraction, built on top of embedded BRAT. ☆15 · Dec 25, 2019 · Updated 6 years ago
- ☆33 · Aug 26, 2025 · Updated 6 months ago
- ☆36 · Feb 21, 2025 · Updated last year
- Pytorch implementation of EntQA paper ☆65 · May 21, 2022 · Updated 3 years ago
- Benchmark API for Multidomain Language Modeling ☆25 · Aug 26, 2022 · Updated 3 years ago
- Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering ☆50 · Aug 2, 2022 · Updated 3 years ago
- Webextension experiment proposing 'printservice' Javascript API ☆10 · Jul 25, 2018 · Updated 7 years ago
- Source code for "Revisiting Unsupervised Relation Extraction" in ACL 2020 ☆36 · Jun 20, 2023 · Updated 2 years ago
- ☆15 · May 26, 2021 · Updated 4 years ago
- a script to get a JSON file listing wikidata properties ids and their label in a given language ☆34 · Feb 21, 2017 · Updated 9 years ago
- Neural Language Models for Historical Research ☆29 · Oct 16, 2024 · Updated last year
- PyTorch DataLoader for many VQA datasets ☆14 · Jan 10, 2023 · Updated 3 years ago
- Monorepo containing all addwiki libraries, packages and applications ☆17 · Feb 17, 2026 · Updated last month
- ☆68 · Oct 27, 2023 · Updated 2 years ago
- EMNLP2023 - InfoSeek: A New VQA Benchmark focus on Visual Info-Seeking Questions ☆25 · May 30, 2024 · Updated last year
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer. ☆14 · Mar 2, 2024 · Updated 2 years ago
- The software associated with a paper accepted at EMNLP 2021 titled "Open Knowledge Graphs Canonicalization using Variational Autoencoders… ☆16 · Sep 27, 2021 · Updated 4 years ago
- Code for our TSD paper "TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models" ☆14 · Aug 19, 2022 · Updated 3 years ago
- Own pywikibot scripts (for Wikimedia projects) ☆22 · Nov 30, 2025 · Updated 3 months ago
- Autoregressive Entity Retrieval ☆796 · Jul 6, 2023 · Updated 2 years ago
- Wikidata Subsetting ☆17 · Feb 26, 2023 · Updated 3 years ago
- Repo for the paper: Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework ☆14 · May 31, 2023 · Updated 2 years ago
- spaCy module for linking text to Wikidata items ☆243 · Mar 9, 2023 · Updated 3 years ago
- ☆33 · Jul 25, 2024 · Updated last year
- Tools to process OpenAlex raw snapshot files ☆12 · Jan 17, 2025 · Updated last year
- A dataset for realistic evaluation of noisy label methods ☆14 · Dec 3, 2023 · Updated 2 years ago