Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.
☆62 · Jul 1, 2021 · Updated 4 years ago
Alternatives and similar repositories for exquisite-corpus
Users that are interested in exquisite-corpus are comparing it to the libraries listed below.
- Access a database of word frequencies, in various natural languages. ☆1,646 · Jan 4, 2025 · Updated last year
- Loads OpenSubtitles v2018 dataset without having to load everything into memory at once. Works well with pytorch. ☆13 · Aug 26, 2020 · Updated 5 years ago
- Ini file library for Python ☆31 · Jun 7, 2023 · Updated 2 years ago
- lachesis automates the segmentation of a transcript into closed captions ☆35 · Jan 26, 2017 · Updated 9 years ago
- ☆15 · Sep 20, 2011 · Updated 14 years ago
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik ☆24 · Jun 9, 2016 · Updated 9 years ago
- Entry task to python intermediate course ☆10 · Feb 8, 2017 · Updated 9 years ago
- ☆10 · Jun 11, 2019 · Updated 6 years ago
- a Haskell library that implements (Projective) Discourse Representation Theory (DRT) ☆27 · Sep 15, 2022 · Updated 3 years ago
- TPTP python library and benchmarking service ☆13 · Oct 2, 2019 · Updated 6 years ago
- ☆27 · Oct 22, 2012 · Updated 13 years ago
- A rule engine based on Attempto Controlled English ☆18 · Nov 1, 2024 · Updated last year
- Tool to fix bitexts and tag near-duplicates for removal ☆34 · Sep 4, 2025 · Updated 7 months ago
- enable rapid iteration and development of complex data pipelines ☆30 · Mar 9, 2025 · Updated last year
- Data and all ☆14 · Sep 30, 2019 · Updated 6 years ago
- Comprehensive LLM evaluation framework: GPQA Diamond to Chatbot Arena. Tests all major models equally, easily extensible. ☆17 · Aug 22, 2024 · Updated last year
- Resize photo ID images using face recognition technology ☆14 · Dec 7, 2020 · Updated 5 years ago
- Various test fonts (OpenType, OpenType with TrueType GX variation extensions, Multiple Master) for testing implementations of font format… ☆11 · Jun 25, 2025 · Updated 9 months ago
- Code repository for the WWW 2019 paper "Predicting ConceptNet Path Quality Using Crowdsourced Assessments of Naturalness" ☆12 · Feb 1, 2019 · Updated 7 years ago
- An abductive reasoning engine written in C++. ☆13 · Dec 28, 2018 · Updated 7 years ago
- Irony - .NET Language Implementation Kit. Forked/Mirrored from CodePlex to Add Gtk# Explorer ☆11 · Jan 21, 2016 · Updated 10 years ago
- ☆45 · Mar 26, 2026 · Updated 3 weeks ago
- openFrameworks addon for drawing fonts using signed distance functions (SDF) ☆13 · Jul 16, 2018 · Updated 7 years ago
- An open-source NLP library: fast text cleaning and preprocessing ☆23 · Nov 9, 2021 · Updated 4 years ago
- An implementation of Defeasible Logic in Python ☆15 · Sep 2, 2018 · Updated 7 years ago
- Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+ ☆13 · Oct 18, 2025 · Updated 6 months ago
- Browse Android ROMs ☆12 · Dec 5, 2018 · Updated 7 years ago
- Pretty-print markdown ☆32 · Feb 5, 2013 · Updated 13 years ago
- GQR, a Fast Reasoner for Binary Qualitative Constraint Calculi ☆19 · Nov 11, 2017 · Updated 8 years ago
- Simple type converters: make ints, floats, bools and dates from your strings! ☆11 · Jul 23, 2016 · Updated 9 years ago
- BitTorrent Tracker ☆31 · Oct 22, 2009 · Updated 16 years ago
- Hacker's Guide to Visual FoxPro ☆10 · May 18, 2023 · Updated 2 years ago
- rtpmidi package from the Scenic project: https://github.com/sat-metalab/scenic ☆10 · Oct 27, 2015 · Updated 10 years ago
- Natural Language Processing tools ☆12 · Jan 26, 2017 · Updated 9 years ago
- DEPRECATED Export Members of a Facebook Group to a CSV ☆13 · Jun 30, 2020 · Updated 5 years ago
- Multicoloure, a SVGinOT/OpenType-SVG color font TTF & WOFF based on Multicolore Vector Typeface ☆18 · Jun 7, 2016 · Updated 9 years ago
- Javascript tokenizer for english sentences ☆14 · Oct 15, 2015 · Updated 10 years ago
- Parses sentences into dependency trees. ☆11 · Aug 21, 2016 · Updated 9 years ago
- Source Devanagari Sans ☆20 · Oct 28, 2025 · Updated 5 months ago