Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.
☆62 Jul 1, 2021 Updated 4 years ago
Alternatives and similar repositories for exquisite-corpus
Users that are interested in exquisite-corpus are comparing it to the libraries listed below.
- Access a database of word frequencies, in various natural languages. ☆1,653 Jan 4, 2025 Updated last year
- An LL parser for extracting information from Wiki text, particularly Wiktionary. ☆50 Aug 16, 2023 Updated 2 years ago
- JavaScript port of SymSpell for Node.js ☆13 Sep 30, 2022 Updated 3 years ago
- Tools for indexing gzip files to support random-like access. ☆28 Mar 15, 2021 Updated 5 years ago
- ☆11 Nov 14, 2021 Updated 4 years ago
- lachesis automates the segmentation of a transcript into closed captions ☆35 Jan 26, 2017 Updated 9 years ago
- ☆27 Oct 14, 2022 Updated 3 years ago
- The Open Multilingual Wordnet ☆72 May 6, 2024 Updated 2 years ago
- ☆27 Oct 22, 2012 Updated 13 years ago
- PANiC - PAraphrasing Noun-Compounds ☆15 Apr 6, 2018 Updated 8 years ago
- Builder for .NET for Compiler Explorer ☆13 Feb 5, 2026 Updated 3 months ago
- enable rapid iteration and development of complex data pipelines ☆30 Mar 9, 2025 Updated last year
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ☆75 Apr 1, 2025 Updated last year
- Data and related code for ACL2019 paper "Implicit Discourse Relation Identification for Open-domain Dialogues" ☆12 Jul 29, 2019 Updated 6 years ago
- An abductive reasoning engine written in C++. ☆13 Dec 28, 2018 Updated 7 years ago
- Various test fonts (OpenType, OpenType with TrueType GX variation extensions, Multiple Master) for testing implementations of font format… ☆11 Jun 25, 2025 Updated 10 months ago
- Object Resource Stream and CDXJ Drafts ☆15 Nov 28, 2018 Updated 7 years ago
- Irony - .NET Language Implementation Kit. Forked/Mirrored from CodePlex to Add Gtk# Explorer ☆11 Jan 21, 2016 Updated 10 years ago
- openFrameworks addon for drawing fonts using signed distance functions (SDF) ☆13 Jul 16, 2018 Updated 7 years ago
- Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+ ☆13 Oct 18, 2025 Updated 6 months ago
- Wrapper for 'unrtf' utility to extract text from RTF documents ☆15 Feb 26, 2026 Updated 2 months ago
- Visual and Embodied Concepts evaluation benchmark ☆21 Oct 10, 2023 Updated 2 years ago
- Browse Android ROMs ☆12 Dec 5, 2018 Updated 7 years ago
- This package supports implementation of anchor-based topic modeling and variants of the anchoring algorithm in Python 3. ☆15 Sep 17, 2018 Updated 7 years ago
- OpusFilter - Parallel corpus processing toolkit ☆115 Apr 8, 2026 Updated last month
- GQR, a Fast Reasoner for Binary Qualitative Constraint Calculi ☆20 Nov 11, 2017 Updated 8 years ago
- Simple type converters: make ints, floats, bools and dates from your strings! ☆11 Jul 23, 2016 Updated 9 years ago
- A library for converting UFOs to SVG fonts. ☆17 Nov 11, 2022 Updated 3 years ago
- Place to store .md notes and host other things related to work I do ☆15 Jun 20, 2023 Updated 2 years ago
- Stanford CoreNLP Extensions: Fork to provide the ability to capture Multi-Word Expressions ☆10 Jun 14, 2022 Updated 3 years ago
- BitTorrent Tracker ☆31 Oct 22, 2009 Updated 16 years ago
- Convert a memrise lesson to anki decks, given a memrise lesson ID ☆19 Feb 20, 2019 Updated 7 years ago
- Internet Archive client for the web (JavaScript & NodeJS). ☆11 Aug 13, 2016 Updated 9 years ago
- Launch a Google search for exceptions from Python apps ☆28 Jan 19, 2015 Updated 11 years ago
- Collects a multimodal dataset of Wikipedia articles and their images ☆16 Mar 25, 2023 Updated 3 years ago
- rtpmidi package from the Scenic project: https://github.com/sat-metalab/scenic ☆10 Oct 27, 2015 Updated 10 years ago
- Post a json payload to Rabbit MQ using curl ☆11 Apr 9, 2018 Updated 8 years ago
- Javascript tokenizer for english sentences ☆14 Oct 15, 2015 Updated 10 years ago
- Parses sentences into dependency trees. ☆11 Aug 21, 2016 Updated 9 years ago