Lightweight piece tokenization library
☆12 Apr 15, 2024 Updated last year
Alternatives and similar repositories for curated-tokenizers
Users who are interested in curated-tokenizers are comparing it to the libraries listed below.
- Wrapper for the macOS signpost API ☆16 Apr 24, 2023 Updated 2 years ago
- Trying to deconstruct RWKV in understandable terms ☆14 May 6, 2023 Updated 2 years ago
- Modular Rust transformer/LLM library using Candle ☆38 May 5, 2024 Updated last year
- This repository contains source code for SoftCTC. Original paper can be found here: https://arxiv.org/abs/2212.02135 ☆19 Mar 7, 2023 Updated 2 years ago
- Fine-grained sentiment annotations of NoReC ☆20 Aug 1, 2022 Updated 3 years ago
- Generate a SQLite database from Wikipedia & Wikidata dumps. ☆36 Mar 27, 2024 Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆12 Nov 14, 2025 Updated 3 months ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 Jul 19, 2024 Updated last year
- A Python algorithm to change the pitch of the voice in real time ☆13 Dec 13, 2020 Updated 5 years ago
- Vision-Language Models Toolbox: Your all-in-one solution for multimodal research and experimentation ☆12 Feb 16, 2025 Updated last year
- Plug-and-play document AI with zero-shot models. ☆124 Feb 16, 2026 Updated last week
- ☆39 Oct 3, 2022 Updated 3 years ago
- (READ ONLY MIRROR) The ProB Model Checker and Animator Plugin for Rodin ☆19 Updated this week
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best … ☆10 Nov 3, 2023 Updated 2 years ago
- ☆15 Aug 5, 2025 Updated 6 months ago
- Grunt task to get the list of Ext.require dependencies in your application ☆32 Oct 8, 2016 Updated 9 years ago
- CVPR 2023: PAniC-3D, Vtubers dataset downloader ☆13 Apr 22, 2023 Updated 2 years ago
- ☆16 Jul 23, 2023 Updated 2 years ago
- ☆23 Jun 19, 2025 Updated 8 months ago
- A pre-commit hook for Pyrefly. ☆23 Updated this week
- ☆12 Oct 13, 2014 Updated 11 years ago
- ☆22 Dec 23, 2025 Updated 2 months ago
- ☆22 Dec 11, 2025 Updated 2 months ago
- OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM ☆50 Oct 10, 2024 Updated last year
- DaCy: The State of the Art Danish NLP pipeline using SpaCy ☆100 Dec 26, 2024 Updated last year
- Template for Python-based data science projects in the Alexandra Institute. ☆12 Feb 15, 2026 Updated 2 weeks ago
- Light and dark variants for Visual Studio Code of the Base16 Grayscale theme by Chris Kempson ☆10 May 11, 2017 Updated 8 years ago
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation ☆15 Aug 1, 2024 Updated last year
- Service for Sails framework with SMS features [DEAD] ☆12 Mar 6, 2021 Updated 4 years ago
- Kernel objects for scaling and format conversion within VapourSynth ☆12 Nov 5, 2025 Updated 3 months ago
- A Prosody XMPP plug and play server ☆11 Apr 25, 2024 Updated last year
- Keyscan: AI-powered API key scanner for GitHub Gists. ☆30 Jan 1, 2026 Updated 2 months ago
- Flutter ObjectDB listener with reactive store ☆14 Oct 4, 2018 Updated 7 years ago
- ☆11 May 29, 2025 Updated 9 months ago
- Phonemes and durations labeling based on whisper small ☆11 Jul 7, 2024 Updated last year
- [INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset ☆12 Sep 29, 2025 Updated 5 months ago
- Proxify Molotov.tv DRM to share content publicly ☆10 Jun 24, 2020 Updated 5 years ago
- ☆12 Apr 26, 2024 Updated last year
- ☆11 May 11, 2023 Updated 2 years ago