joaoventura/WikiCorpusExtractor



Extracts text from WikiMedia XML Dump files

☆24

Alternatives and similar repositories for WikiCorpusExtractor

Users who are interested in WikiCorpusExtractor are comparing it to the repositories listed below.


gregdeon / spotlight
Implementation of the spotlight: a method for discovering systematic errors in deep learning models
☆11 · Oct 5, 2021 · Updated 4 years ago
clinicalml / teaching-to-understand-ai
Code and webpages for our study on teaching humans to defer to an AI
☆12 · Nov 6, 2023 · Updated 2 years ago
kailas-v / human-ai-interactions
☆11 · Oct 28, 2022 · Updated 3 years ago
stanford-policylab / recidivism-predictions
Replication code for "The Limits of Human Predictions of Recidivism" by Lin et al. (2020)
☆10 · May 1, 2020 · Updated 6 years ago
ohenrik / nb_dep_ud_sm
spaCy model trained on a Norwegian corpus converted from OBT to Universal Dependencies.
☆13 · Jan 31, 2018 · Updated 8 years ago
juyongjiang / Awesome-ANCE
Implementation of paper "Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval"
☆17 · Jan 10, 2022 · Updated 4 years ago
crate / crate-dbal
Doctrine Database Access Layer (DBAL) for CrateDB.
☆16 · Jul 1, 2026 · Updated 2 weeks ago
mitharvardwai / mitxharvard-wai-resources
Compilation of ML/AI Resources for Members of MITxHarvard Women in AI
☆11 · Mar 28, 2022 · Updated 4 years ago
jpayne0061 / python_crawler
This script no longer works due to changes in Amazon's servers
☆10 · Mar 12, 2017 · Updated 9 years ago
utrack / pbtree
manage your protofile tree and vendor remote files
☆13 · Sep 6, 2023 · Updated 2 years ago
xyproto / simpleredis
Simple way to use Redis from Go
☆25 · May 8, 2026 · Updated 2 months ago
clinicalml / human_ai_deferral
Human-AI Deferral Evaluation Benchmark (Learning to Defer) AISTATS23
☆22 · Jan 8, 2024 · Updated 2 years ago
aws-samples / aws-batch-python-sample
☆19 · Mar 27, 2020 · Updated 6 years ago
AlexanderParkin / MCS2018.Baseline
☆17 · Feb 25, 2019 · Updated 7 years ago
physionetchallenges / evaluation-2019
Evaluation code for the PhysioNet/CinC Challenge 2019
☆25 · Nov 10, 2019 · Updated 6 years ago
kaz-Anova / Competitive_Dai
The code to generate a top 20 score in the amazon classification challenge using Driverless AI's predictions and feature engineering : In…
☆19 · Dec 2, 2017 · Updated 8 years ago
cbmi-uthsc / deepSepsis
Deep learning model for sepsis prediction using high-frequency data
☆18 · May 5, 2019 · Updated 7 years ago
anomalyco / sst-weekly-repos
Repos from the SST Weekly streams
☆23 · Sep 2, 2022 · Updated 3 years ago
carsonfarmer / addc
Data-structure for online/streaming clustering of non-stationary data.
☆16 · Jul 15, 2016 · Updated 10 years ago
lifinance / ask-lifi-docs
Simple CLI demo for chatting with LIFI docs
☆13 · Apr 18, 2023 · Updated 3 years ago
coralproject / ask-wp-plugin
A WordPress plugin for Ask
☆11 · Feb 1, 2019 · Updated 7 years ago
Tenzer / quirky
QR code printer for your terminal
☆10 · May 23, 2021 · Updated 5 years ago
MediaMath / lambda-cron
LambdaCron - serverless cron tool
☆25 · Nov 1, 2017 · Updated 8 years ago
OpenOil-UG / aleph
Toys for sifting through large sets of documents.
☆13 · Feb 3, 2017 · Updated 9 years ago
phonkee / patrol
Patrol error logging platform http://patrol.name/
☆24 · Jun 18, 2015 · Updated 11 years ago
microsoft / coderec_programming_states
Code and Data for: Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming
☆33 · Feb 23, 2024 · Updated 2 years ago
openpreserve / matchbox
Image comparison QA tool for digital preservation workflows.
☆14 · Nov 17, 2014 · Updated 11 years ago
atduskgreg / GeneratedDetective
My NaNoGenMo 2014 project: a generative detective comic
☆16 · Nov 22, 2014 · Updated 11 years ago
slanglab / twitteraae
Code for Blodgett et al. 2016, Demographic dialectal variation in social media
☆26 · Nov 9, 2019 · Updated 6 years ago
bakins / lua-resty-beanstalkd
Simple Beanstalkd client for nginx/openresty
☆16 · Aug 17, 2012 · Updated 13 years ago
lordvcs / gpt3-ghost-writer
A python flask app that generates a spooky story using openai's gpt-3
☆13 · Feb 20, 2021 · Updated 5 years ago
bussnet / money
PHP implementation of Fowler's Money pattern
☆19 · Apr 22, 2015 · Updated 11 years ago
jazzido / crowdata
Easily crowdsource the analysis of your documents
☆16 · Jun 4, 2014 · Updated 12 years ago
elliottslaughter / integrity-checker
Backup integrity checker
☆21 · Jun 13, 2023 · Updated 3 years ago
safferli / james_bond_films
Analysis of James Bond films
☆14 · Nov 10, 2015 · Updated 10 years ago
europeana / europeana-portal-collections
Europeana Collections portal as a Rails + Blacklight application.
☆19 · Apr 11, 2022 · Updated 4 years ago
coralproject / ask-install
Installer for Ask
☆20 · Oct 23, 2018 · Updated 7 years ago
nanopack / flip
Simple, lightweight, virtual IP management utility for moving IPs around nodes in response to cluster events.
☆21 · Nov 13, 2015 · Updated 10 years ago
KnpLabs / PiwikClient
[UNMAINTAINED] Simple Piwik API client, written in PHP 5.3
☆17 · Feb 28, 2014 · Updated 12 years ago