shyamupa/wikidump_preprocessing

Extracting useful metadata from Wikipedia dumps in any language.

☆26

Alternatives and similar repositories for wikidump_preprocessing

Users who are interested in wikidump_preprocessing are comparing it to the repositories listed below.

shyamupa / xling-el
PyTorch model for cross-lingual entity linking.
☆16 · Mar 13, 2019 · Updated 7 years ago
mohit3011 / Online-Antisemitism-Detection-Using-MultimodalDeep-Learning
Repository for our paper “Subverting the Jewtocracy”: Online Antisemitism Detection Using Multimodal Deep Learning
☆12 · Apr 29, 2022 · Updated 4 years ago
uhh-lt / wsd
A system for unsupervised knowledge-free interpretable word sense disambiguation based on distributional semantics
☆19 · Mar 25, 2018 · Updated 8 years ago
studio-ousia / textent
Representation Learning of Entities and Documents from Knowledge Base Descriptions
☆18 · Oct 6, 2018 · Updated 7 years ago
harvardnlp / annotated-attention
☆15 · Aug 8, 2018 · Updated 7 years ago
zaemyung / wikiextractor
A tool for extracting plain text from Wikipedia dumps
☆15 · Oct 3, 2019 · Updated 6 years ago
ChicagoHAI / decsum
Implementation for Decision-focused Summarization (EMNLP 2021)
☆12 · Mar 14, 2022 · Updated 4 years ago
salesforce / hydra-sum
☆10 · May 1, 2025 · Updated last year
dsindex / segm-lstm
[deprecated] Reference code for string segmentation using LSTM (TensorFlow)
☆19 · Feb 19, 2020 · Updated 6 years ago
Hannibal046 / SDDS
[ACL 2023] Source code for Dialogue Summarization with Static-Dynamic Structure Fusion Graph
☆11 · Dec 17, 2023 · Updated 2 years ago
cttsai / illinois-cross-lingual-wikifier
☆24 · Sep 28, 2017 · Updated 8 years ago
robert-lieck / RBN
Recursive Bayesian Networks
☆11 · May 11, 2025 · Updated last year
abhipec / fnet
Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings
☆19 · Feb 26, 2019 · Updated 7 years ago
suhaibani / JointReps
Learning word representations jointly using a corpus and a knowledge base (KB)
☆19 · Oct 19, 2018 · Updated 7 years ago
Merterm / Modeling-Intensification-for-SLG
Public repo for the paper: "Modeling Intensification for Sign Language Generation: A Computational Approach" by Mert Inan*, Yang Zhong*, …
☆14 · Mar 15, 2022 · Updated 4 years ago
microsoft / iclr2019-learning-to-represent-edits
Code for the ICLR 2019 paper "Learning to Represent Edits"
☆13 · Dec 8, 2022 · Updated 3 years ago
scriptjunkie / sessionthief
Session hijacking GUI tool
☆15 · Oct 20, 2013 · Updated 12 years ago
danieldeutsch / qaeval
☆15 · Aug 3, 2021 · Updated 4 years ago
VanderpoelLiam / CPMI
Mutual Information Predicts Hallucinations in Abstractive Summarization
☆13 · Nov 14, 2022 · Updated 3 years ago
cs329yangzhong / specificityTwitter
Censored tweets annotated for specificity; AAAI 2019 paper: Predicting and Analyzing Language Specificity in Social Media Posts
☆11 · Oct 19, 2021 · Updated 4 years ago
jahid56 / doctor
Hospital & Doctor Information System from Bangladesh. It also has a doctor admin panel to update a doctor's information. One can also bookin…
☆11 · Aug 22, 2016 · Updated 9 years ago
BaseMax / CFG2CNF
Python program to convert a Context Free Grammar to Chomsky Normal Form.
☆10 · May 9, 2025 · Updated last year
ys1998 / vae-latent-structure
PyTorch implementation of "Variational Autoencoders with Jointly Optimized Latent Dependency Structure" [ICLR 2019]
☆13 · Jul 14, 2019 · Updated 7 years ago
thombashi / typepy
A Python library for checking, validating, and converting variable types at run time.
☆17 · Updated this week
velocityCavalry / CREPE
An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions"
☆16 · Nov 5, 2024 · Updated last year
othr-nlp / rage_toolkit
☆11 · Sep 27, 2024 · Updated last year
napsternxg / WikiUtils
A set of utility scripts to process Wikipedia-related data
☆38 · Jul 2, 2022 · Updated 4 years ago
ayusharma / medical_project
A system to prescribe medicine for general symptoms; a 2nd-year undergraduate college project developed in PHP and m…
☆13 · Oct 14, 2018 · Updated 7 years ago
nirlipo / ltl2pddl
LTL2PDDL tool
☆13 · Jul 7, 2017 · Updated 9 years ago
Wendy-Xiao / redundancy_reduction_longdoc
This is the official code for the paper 'Systematically Exploring Redundancy Reduction in Summarizing Long Documents'.
☆16 · Apr 30, 2021 · Updated 5 years ago
studio-ousia / ntee
Neural Text-Entity Encoder (NTEE)
☆81 · Aug 16, 2017 · Updated 8 years ago
rebeccak1 / conversion_rates
Predict conversion rates and generate ideas to improve them
☆10 · Nov 3, 2017 · Updated 8 years ago
ContextScout / ned-graphs
☆38 · Oct 26, 2018 · Updated 7 years ago
violet-zct / swarm-distillation-zero-shot
☆23 · Oct 15, 2022 · Updated 3 years ago
izuna385 / Entity-Linking-Tutorial
Bi-encoder Based Entity Linking Tutorial. You can run an experiment in only 5 minutes. Experiments on Colab Pro GPUs are also supported!
☆34 · May 3, 2021 · Updated 5 years ago
bapspatil / CaptainChef
A Material Design baking/cooking recipes app.
☆11 · Feb 9, 2019 · Updated 7 years ago
ArmaanSethi / Hindsight-Experience-Replay-and-Hierarchical-Reinforcement-Learning
Comp 781 Project
☆10 · Jan 2, 2026 · Updated 6 months ago
susravan / Edge-and-light-detection-android-app
An Android app that shows the edges and light sources in the live feed from the phone's camera
☆11 · Sep 11, 2017 · Updated 8 years ago
facebookresearch / randsent
Exploring Random Encoders for Sentence Classification
☆184 · Mar 6, 2020 · Updated 6 years ago