Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.
☆61Jul 1, 2021Updated 4 years ago
Alternatives and similar repositories for exquisite-corpus
Users that are interested in exquisite-corpus are comparing it to the libraries listed below.
- Access a database of word frequencies, in various natural languages.☆1,641Jan 4, 2025Updated last year
- An LL parser for extracting information from Wiki text, particularly Wiktionary.☆50Aug 16, 2023Updated 2 years ago
- JavaScript port of SymSpell for Node.js☆13Sep 30, 2022Updated 3 years ago
- Tools for indexing gzip files to support random-like access.☆28Mar 15, 2021Updated 5 years ago
- GOPHI: an AMR-to-English Verbalizer☆11Feb 5, 2020Updated 6 years ago
- lachesis automates the segmentation of a transcript into closed captions☆35Jan 26, 2017Updated 9 years ago
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik☆24Jun 9, 2016Updated 9 years ago
- ☆10Jun 11, 2019Updated 6 years ago
- a Haskell library that implements (Projective) Discourse Representation Theory (DRT)☆27Sep 15, 2022Updated 3 years ago
- Some tools for variable fonts☆26Jun 29, 2025Updated 8 months ago
- ☆27Oct 22, 2012Updated 13 years ago
- Resources accompanying the "Zero-Shot Recommendation as Language Modeling" paper (ECIR2022)☆14May 25, 2023Updated 2 years ago
- Tool to fix bitexts and tag near-duplicates for removal☆34Sep 4, 2025Updated 6 months ago
- Data and all☆14Sep 30, 2019Updated 6 years ago
- Automatic split routing of domestic and foreign IP traffic for VPN (Cisco IPSec protocol) on Mac☆11Feb 8, 2018Updated 8 years ago
- An Obsidian plugin that controls file indexing by managing dot prefixes on files/folders to truly hide/exclude them, providing a .gitigno…☆22Mar 18, 2026Updated last week
- Various test fonts (OpenType, OpenType with TrueType GX variation extensions, Multiple Master) for testing implementations of font format…☆11Jun 25, 2025Updated 9 months ago
- Code repository for the WWW 2019 paper "Predicting ConceptNet Path Quality Using Crowdsourced Assessments of Naturalness"☆12Feb 1, 2019Updated 7 years ago
- openFrameworks addon for drawing fonts using signed distance functions (SDF)☆13Jul 16, 2018Updated 7 years ago
- ☆45Mar 8, 2026Updated 2 weeks ago
- An open-source NLP library: fast text cleaning and preprocessing☆23Nov 9, 2021Updated 4 years ago
- Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+☆13Oct 18, 2025Updated 5 months ago
- Visual and Embodied Concepts evaluation benchmark☆21Oct 10, 2023Updated 2 years ago
- Class repository for Generative Typography workshop (Type@Cooper Spring 2022)☆16Apr 6, 2022Updated 3 years ago
- This package supports implementation of anchor-based topic modeling and variants of the anchoring algorithm in Python 3.☆15Sep 17, 2018Updated 7 years ago
- This repository contains source code to binarize any real-value word embeddings into binary vectors.☆48Jan 7, 2021Updated 5 years ago
- ThoughtTreasure commonsense knowledge base and architecture for natural language processing☆79Jul 31, 2015Updated 10 years ago
- GQR, a Fast Reasoner for Binary Qualitative Constraint Calculi☆19Nov 11, 2017Updated 8 years ago
- BitTorrent Tracker☆31Oct 22, 2009Updated 16 years ago
- Python Vector Graphics☆15May 28, 2016Updated 9 years ago
- Tools for working with types where a subset of values has a total order, like e.g. floats without NaN☆13Nov 7, 2025Updated 4 months ago
- Build an EC2 Container Service cluster for running Docker containers.☆10May 30, 2020Updated 5 years ago
- rtpmidi package from the Scenic project: https://github.com/sat-metalab/scenic☆10Oct 27, 2015Updated 10 years ago
- DEPRECATED Export Members of a Facebook Group to a CSV☆13Jun 30, 2020Updated 5 years ago
- Multicoloure, a SVGinOT/OpenType-SVG color font TTF & WOFF based on Multicolore Vector Typeface☆18Jun 7, 2016Updated 9 years ago
- RDF Community Discussions. Ask anything here!☆13Apr 11, 2024Updated last year
- Source Devanagari Sans☆20Oct 28, 2025Updated 4 months ago
- GlyphsApp Scripts☆11Aug 15, 2023Updated 2 years ago
- Neural Unification for Logic Reasoning over Language☆22Nov 15, 2021Updated 4 years ago