erickrf/ptwiki2text

erickrf / ptwiki2text

Python scripts to read a Portuguese Wikipedia XML dump file, parse it and generate plain text files.

☆14

Alternatives and similar repositories for ptwiki2text

Users who are interested in ptwiki2text are comparing it to the libraries listed below.

pedrobalage / Maltparser-Universal-Tree-Bank-PT-BR
Maltparser trained with the Universal Dependency Treebank for Brazilian-Portuguese Language
☆12 · May 25, 2015 · Updated 11 years ago
dasdad / corpus-processor
Handle linguistic corpus and convert it to use NLP tools
☆21 · Jul 5, 2013 · Updated 13 years ago
davidsbatista / awesome-Portuguese-NLP
A list of libraries and NLP projects for Portuguese
☆19 · May 22, 2017 · Updated 9 years ago
cod3licious / textcatvis
tools to analyze a collection of texts and identify relevant words
☆12 · May 20, 2018 · Updated 8 years ago
PKpacheco / meu-portfolio
Django app for creating a personal portfolio.
☆12 · Oct 14, 2022 · Updated 3 years ago
juntingzh / incremental-learning-baselines
☆11 · Jan 16, 2020 · Updated 6 years ago
El3ct71k / Keylogger
☆15 · Mar 2, 2014 · Updated 12 years ago
guilhermedonizetti / OCR_Python
Python application for Optical Character Recognition (OCR), a technique for extracting text from images. Additionally, the program…
☆12 · Aug 13, 2021 · Updated 4 years ago
jehugaleahsa / primitive
Provide user-defined initialization semantics for arithmetic types.
☆11 · Mar 29, 2026 · Updated 3 months ago
JeepShen / vscode-markdown-code-runner
Run code snippets in Markdown.
☆21 · Jul 1, 2023 · Updated 3 years ago
brunoalano / aiotf
Asyncio Tensorflow Serving Communication
☆10 · Feb 2, 2023 · Updated 3 years ago
low-ghost / nerdtree-fugitive
A plugin that adds some fugitive functionality directly to nerdtree for vim
☆25 · Sep 11, 2015 · Updated 10 years ago
aws-samples / amazon-comprehend-medical-omop-notes-mapping
Use Amazon Comprehend Medical to extract medical insight from notes inside the OMOP Common Data Model
☆14 · Feb 28, 2019 · Updated 7 years ago
iulia-b10 / query_transformations
☆13 · Jan 8, 2024 · Updated 2 years ago
volkanaktas / TurtaRoleKontrol
A program written in Python that controls the Raspberry Pi Turta relay board through a graphical interface
☆10 · Nov 30, 2016 · Updated 9 years ago
PacktPublishing / Microsoft-Power-BI-Performance-Best-Practices-Second-Edition
"Microsoft Power BI Performance Best Practices - Second Edition, published by Packt"
☆13 · Mar 2, 2026 · Updated 4 months ago
IBM / watson-streaming-stt
Example of using Watson's Streaming Speech to Text websockets interface for real time transcription. Written in Python. WARNING: This rep…
☆29 · May 27, 2020 · Updated 6 years ago
holgersindbaek / status_bar
RubyMotion status bar wrapper.
☆17 · Nov 10, 2013 · Updated 12 years ago
turing-usp / conceitos-basicos-NLP
Classes on basic Natural Language Processing concepts, offered on the open Turing USP Discord
☆10 · Jul 30, 2021 · Updated 4 years ago
lkshrsch / BreastCancerDiagnosisMRI
☆15 · Apr 17, 2026 · Updated 3 months ago
lykmapipo / ngData
Simple and minimal WebSQL and cordova SQLite ORM for ionic and angular
☆10 · Mar 5, 2016 · Updated 10 years ago
HaoWeiHe / Knowledge-Graph
how to build up Knowledge graph
☆13 · Nov 16, 2021 · Updated 4 years ago
PacktPublishing / Hands-On-Data-Analytics-for-Beginners-with-Google-Colaboratory-Video-
Hands-On Data Analytics for Beginners with Google Colaboratory [Video], published by Packt
☆18 · Jan 15, 2021 · Updated 5 years ago
open-risk / openSecuritisation
Demonstrating technical elements in support of open source securitisation frameworks
☆15 · Sep 5, 2024 · Updated last year
jlawrence6809 / CSS-Selector-Helper-for-Chrome
☆10 · Jan 1, 2026 · Updated 6 months ago
ptbrowne / nlp
project nlp
☆25 · Sep 20, 2012 · Updated 13 years ago
mateuspadua / django-admin-report
Create reports using the full potential of the Django admin
☆15 · Dec 19, 2019 · Updated 6 years ago
zendesk / sunshine-conversations-ruby
Smooch API Library for Ruby
☆15 · Feb 5, 2026 · Updated 5 months ago
ucbrise / jarvis
Build, configure, and track workflows with Jarvis.
☆14 · Apr 17, 2018 · Updated 8 years ago
idwall / desafios-iddog
iddog challenge for frontend and mobile
☆22 · Jun 15, 2020 · Updated 6 years ago
ucam-department-of-psychiatry / crate
Create and use de-identified research databases. Preprocess, extract text, anonymise/de-identify, link, apply natural language processing…
☆24 · Updated this week
PacktPublishing / Coding-with-ChatGPT-and-Other-LLMs
Coding with ChatGPT and other LLMs, published by Packt
☆16 · Dec 9, 2024 · Updated last year
udibr / pointer-generator
Code for the ACL 2017 paper "Get To The Point: Summarization with Pointer-Generator Networks"
☆13 · Jul 5, 2017 · Updated 9 years ago
TerryCavanagh / triangle-run
My game for the #stopwaitingforgodot game jam
☆28 · Sep 12, 2021 · Updated 4 years ago
gsi-upm / scaner
Social Context Analysis aNd Emotion Recognition
☆12 · Jul 11, 2017 · Updated 9 years ago
alvations / expletives
Expletives vomiting library...
☆13 · Apr 18, 2026 · Updated 3 months ago
shmulvad / zero-for-ner
Zero-Shot Learning in Named Entity Recognition with Common Sense Knowledge
☆17 · Nov 16, 2021 · Updated 4 years ago
hslh / pie-detection
Automatic Detection of Potentially Idiomatic Expressions
☆12 · Feb 19, 2021 · Updated 5 years ago
MtDalPizzol / webpack-starter
A Webpack boilerplate with ES6 and SCSS for simple web projects.
☆11 · Oct 27, 2016 · Updated 9 years ago