Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.
☆59Jul 1, 2021Updated 4 years ago
Alternatives and similar repositories for exquisite-corpus
Users who are interested in exquisite-corpus are comparing it to the libraries listed below.
- Access a database of word frequencies, in various natural languages.☆1,633Jan 4, 2025Updated last year
- JavaScript port of SymSpell for Node.js☆13Sep 30, 2022Updated 3 years ago
- Loads OpenSubtitles v2018 dataset without having to load everything into memory at once. Works well with pytorch.☆13Aug 26, 2020Updated 5 years ago
- Pocketsphinx-based Linux Voice Dictation☆25Jun 12, 2020Updated 5 years ago
- ☆27Oct 22, 2012Updated 13 years ago
- Tools for indexing gzip files to support random-like access.☆28Mar 15, 2021Updated 4 years ago
- Machine translation for the real world☆23Jan 22, 2020Updated 6 years ago
- Serapis is a sentence identifier and modeling pipeline / built for Wordnik☆24Jun 9, 2016Updated 9 years ago
- Some tools for variable fonts☆26Jun 29, 2025Updated 8 months ago
- ☆10Sep 29, 2022Updated 3 years ago
- A Python library for reading the YXDB file format☆11Jul 11, 2024Updated last year
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary☆93Updated this week
- Transform Oracle PL/SQL Code to Python☆11Oct 26, 2013Updated 12 years ago
- ☆13Feb 11, 2020Updated 6 years ago
- Daily archive of YLE Selkouutiset. Updates every day around midnight.☆12Updated this week
- console version of StarDict formerly used by koreader; fork of https://github.com/Dushistov/sdcv☆10Aug 29, 2017Updated 8 years ago
- PwnHub is a CTF collaboration platform written in Bash, originally built as a response to a joke about the Bash Stack by yousuckatprogram…☆20Jun 13, 2025Updated 8 months ago
- Functional composable pipelines allowing clean separation of the business logic and its implementation☆11Sep 6, 2025Updated 6 months ago
- ArcGIS Python toolbox for automated placement of supplementary contour lines☆10Sep 17, 2020Updated 5 years ago
- Various test fonts (OpenType, OpenType with TrueType GX variation extensions, Multiple Master) for testing implementations of font format…☆11Jun 25, 2025Updated 8 months ago
- All remy's bins☆14Jan 29, 2026Updated last month
- SQL Query Tool for Static Files☆104Jun 19, 2014Updated 11 years ago
- Send your shell preferences to a shared account for an SSH session.☆22Mar 5, 2015Updated 11 years ago
- Technical Analysis Library☆12Aug 25, 2012Updated 13 years ago
- Embed and execute SQL in Markdown☆13Mar 21, 2024Updated last year
- A reflection-based JSON (de)serialization library written in and for F#☆14Nov 19, 2020Updated 5 years ago
- Simple single-file FUSE implementation of copy-on-write☆10Aug 14, 2014Updated 11 years ago
- ☆10Nov 12, 2023Updated 2 years ago
- Graphics pipeline tool for Neo Geo development☆12Nov 3, 2024Updated last year
- Main repository of the Flint project for Spark and Amazon EMR.☆11Jan 31, 2020Updated 6 years ago
- Examples of using common Python profiling techniques☆12May 24, 2019Updated 6 years ago
- An Astro template for your iOS apps.☆12Mar 7, 2023Updated 3 years ago
- Read Safari's Bookmarks.plist☆16May 26, 2023Updated 2 years ago
- ☆23Updated this week
- GlyphsApp Scripts☆11Aug 15, 2023Updated 2 years ago
- Ollie is an open information extractor that uses dependency parses.☆12Sep 27, 2013Updated 12 years ago
- It's All Text! Editor Daemon☆17Nov 21, 2019Updated 6 years ago
- A simple Glyphs App plugin to find words that contain the selected glyphs.☆11Sep 18, 2024Updated last year
- 🔒📈 Host file tools written in Rust.☆15Dec 23, 2025Updated 2 months ago