Byte-Pair Encoding (BPE) (subword-based tokenization) algorithm implementations from scratch with Python
☆18Jan 30, 2023Updated 3 years ago
Alternatives and similar repositories for byte_pair_encoding_BPE_subword_tokenization_implementation_python
Users who are interested in byte_pair_encoding_BPE_subword_tokenization_implementation_python are comparing it to the libraries listed below.
- WIP rust bindings for Awesomium browser☆11Jun 24, 2016Updated 9 years ago
- ☆18Mar 26, 2015Updated 11 years ago
- Pretty printing for ImmutableJS☆12Jun 15, 2016Updated 9 years ago
- Code for the paper attend, copy, parse - End-to-end information extraction from documents (https://arxiv.org/pdf/1812.07248.pdf)☆13Jun 2, 2022Updated 3 years ago
- Configuration files☆12Jan 21, 2024Updated 2 years ago
- Simple Entity-Component System☆11Apr 21, 2015Updated 11 years ago
- Elm Set built on top of AnyDict☆10Aug 12, 2024Updated last year
- ☆13Jan 20, 2023Updated 3 years ago
- ☆13Jul 17, 2021Updated 4 years ago
- Additional color handling for Elm☆13Mar 7, 2019Updated 7 years ago
- For loops in const☆13Sep 7, 2024Updated last year
- A basic Electron App using elm☆14Oct 4, 2017Updated 8 years ago
- A C++ library implementing fast language model estimation using the 1-Sort algorithm.☆16May 18, 2023Updated 3 years ago
- char <-> Unicode character name (maintained fork of huonw/unicode_names)☆12Sep 7, 2025Updated 8 months ago
- Raw rust bindings to the enet C library☆21Mar 16, 2026Updated 2 months ago
- A calculator written in the elm language☆13Oct 31, 2022Updated 3 years ago
- Rust tool to get info from your lycamobile.es account☆10Apr 29, 2021Updated 5 years ago
- Chinese Word Segmentation task based on BERT and implemented in Pytorch☆14Aug 14, 2020Updated 5 years ago
- Basic persistent storage for dokku (https://github.com/progrium/dokku)☆53Aug 31, 2015Updated 10 years ago
- A demo project for using `emulators` to generate screenshots for a Flutter project☆13Feb 24, 2022Updated 4 years ago
- dMel: Speech Tokenization Made Simple☆20May 13, 2025Updated last year
- An offline Rust thesaurus library.☆12Aug 13, 2022Updated 3 years ago
- char <-> Unicode character name☆24Aug 20, 2016Updated 9 years ago
- ☆12Sep 25, 2022Updated 3 years ago
- ☆10Jun 17, 2020Updated 5 years ago
- ☆10Nov 25, 2022Updated 3 years ago
- ☆14Nov 6, 2017Updated 8 years ago
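- Open Chinese Convert (开放中文转换) - Simplified-Traditional conversion following the General Standard Chinese Characters (通用规范汉字) standard☆19May 16, 2026Updated 2 weeks ago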
- Additional basic functions for Elm.☆15Feb 23, 2023Updated 3 years ago
- Korean input method: RIME IME schema for typing Korean Hangul and Hanja☆12Jul 10, 2020Updated 5 years ago
- A Cantonese-English translator based on prompt engineering☆12Sep 19, 2023Updated 2 years ago
- refinement types for Elm☆16Jul 12, 2023Updated 2 years ago
- Github mirror of MediaWiki extension TextExtracts - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Dev…☆15May 22, 2026Updated last week
- ☆15Feb 2, 2019Updated 7 years ago
- Create an SVG path in which each point can have a variable width.☆17Feb 13, 2025Updated last year
- Loengfan (粵語兩分) is the Cantonese version of the Liang Fen input method☆15Mar 3, 2022Updated 4 years ago
- Write JSON decoders in Elm using continuation-style.☆16Apr 11, 2023Updated 3 years ago
- Modern and elegant test framework for Flutter, inspired by Cypress☆18May 4, 2022Updated 4 years ago
- A dropdown component for Elm☆12Jan 28, 2019Updated 7 years ago