eberlitz/pt-br-corpus


eberlitz / pt-br-corpus

pt-BR Corpus with the Wikipedia dump

☆27

Alternatives and similar repositories for pt-br-corpus

Users who are interested in pt-br-corpus are comparing it to the libraries listed below.

rdenadai / BR-BERTo
Transformer model for Portuguese language (Brazil pt_BR)
☆16 · Jul 13, 2026 · Updated last week
felipeparpinelli / word2vec-pt-br
Implementation and model generated by training (trigram) on the pt-br Wikipedia
☆38 · Mar 23, 2017 · Updated 9 years ago
cgl / portuguese-nlp
NLP work on Brazilian Portuguese newswire text
☆21 · Jun 20, 2016 · Updated 10 years ago
nunorc / squad-v1.1-pt
Portuguese translation of the SQuAD dataset
☆19 · Oct 22, 2020 · Updated 5 years ago
LIAMF-USP / Word2vec-pt
TensorFlow implementation of the Skipgram model with different scripts to train Portuguese word embeddings.
☆18 · Aug 26, 2017 · Updated 8 years ago
Meeds-io / meeds-docker
Meeds Docker image building
☆13 · Updated this week
AndrewMishchenko / sqltomongo
Translates SQL queries to MongoDB queries.
☆11 · Jun 15, 2017 · Updated 9 years ago
ulysses-camara / ulysses-segmenter
Pretrained segmenter models for Portuguese legislative text.
☆16 · Oct 13, 2024 · Updated last year
pedrobalage / nlppt
Python Library for Natural Language Processing for Portuguese Language
☆17 · Mar 2, 2016 · Updated 10 years ago
ruanchaves / elmo
Supporting code for the paper "Portuguese Language Models and Word Embeddings: Evaluating on Semantic Similarity Tasks".
☆11 · Dec 8, 2022 · Updated 3 years ago
easynlp / easynlp
☆11 · Aug 12, 2021 · Updated 4 years ago
avijit-thawani / SWOW-eval
Intrinsic Evaluation of pre-trained word embeddings, using large Word Association Dataset: SWOW (Small World of Words)
☆11 · Feb 28, 2024 · Updated 2 years ago
pln-pucrs / cci-regression
Charlson Comorbidity Index Regression using Clinical Notes
☆10 · Jul 26, 2018 · Updated 7 years ago
YuriNiella / RSP
Refining the Shortest Paths (RSP) of animals tracked with acoustic transmitters in estuarine regions
☆18 · Nov 25, 2025 · Updated 8 months ago
erickrf / assin
Evaluation and baseline scripts for the ASSIN shared task.
☆11 · Oct 12, 2019 · Updated 6 years ago
unicamp-dl / PTT5
Code for training and evaluating T5 on Portuguese data.
☆91 · Dec 8, 2022 · Updated 3 years ago
pln-pucrs / fall-detection
Fall Detection in EHR using Word Embeddings and Deep Learning
☆16 · Nov 23, 2020 · Updated 5 years ago
pablo-tech / BERT-Legal-Classification
CaseText Court Case analysis with fine-tuned BERT Transformer
☆14 · Jun 26, 2020 · Updated 6 years ago
medialab / twitwi
Collection of Twitter-related helper functions for Python.
☆14 · Feb 24, 2026 · Updated 5 months ago
Guilherme-B / manifold
Manifold is a plug-and-play end-to-end real estate asset tracker, from web scraping to ETL (data warehouse) using Python, Go, Apache Airf…
☆14 · Apr 23, 2021 · Updated 5 years ago
SkBlaz / tax2vec
Interpretable feature construction from taxonomies for text classification
☆18 · Apr 4, 2022 · Updated 4 years ago
millengustavo / demo-datasus-streamlit
Demo Application with DataSUS death records and Streamlit
☆11 · Dec 14, 2019 · Updated 6 years ago
kanekomasahiro / evaluate_bias_in_mlm
☆13 · Dec 1, 2021 · Updated 4 years ago
DeaBardhoshi / Data-Science-Projects
☆12 · Aug 3, 2024 · Updated last year
eogasawara / mylibrary
Research - data mining
☆13 · Jun 25, 2026 · Updated last month
ADGEfficiency / climate-news-db
A database of climate change newspaper articles
☆16 · Jan 31, 2026 · Updated 5 months ago
chanddu / Sentence-similarity-based-on-Semantic-nets-and-Corpus-Statistics-
This is an implementation of the paper written by Yuhua Li, David McLean, Zuhair A. Bandar, James D. O’Shea, and Keeley Crockett
☆20 · Oct 11, 2019 · Updated 6 years ago
nonameemnlp2020 / legalBERT
LEGAL-BERT: Preparing the Muppets for Court
☆15 · Jun 1, 2020 · Updated 6 years ago
deepberlin1 / aiforgood2020
General information about DEEP BERLIN's AI for Good Hackathon 2020
☆11 · Apr 14, 2020 · Updated 6 years ago
mustaszewski / europarl-extract
☆20 · Jan 10, 2019 · Updated 7 years ago
metadatacenter / cedar-project
Build project for all CEDAR Java repositories
☆12 · Updated this week
loicdtx / pygadm
Easy access to administrative boundary data with Python
☆17 · Oct 4, 2022 · Updated 3 years ago
UniversalDependencies / UD_Portuguese-Bosque
The Universal Dependencies (UD) Portuguese treebank.
☆53 · May 6, 2026 · Updated 2 months ago
katyfelkner / winoqueer
☆17 · Mar 6, 2025 · Updated last year
gaterslebenchen / JLibFFM
A Java implementation of LIBFFM: A Library for Field-aware Factorization Machines
☆10 · Jan 4, 2022 · Updated 4 years ago
fnielsen / dasem
Danish Semantic analysis
☆18 · Sep 24, 2020 · Updated 5 years ago
jogonba2 / twilbert
Specialization of BERT architecture both for the Spanish language and the Twitter domain
☆13 · Nov 6, 2020 · Updated 5 years ago
alura-cursos / curso_OrientacaoObjetosC01R
☆11 · Dec 4, 2022 · Updated 3 years ago
HAILab-PUCPR / SemClinBr
SemClinBr - a multi-institutional and multi-specialty semantically annotated corpus for Portuguese clinical NLP tasks
☆37 · Mar 12, 2024 · Updated 2 years ago