A Python script for manipulating an existing tokenizer.
☆21 · Mar 6, 2026 · Updated 2 months ago
Alternatives and similar repositories for Tokenizer-Changer
Users who are interested in Tokenizer-Changer are comparing it to the repositories listed below.
- Beyond KV Caching: Shared Attention for Efficient LLMs · ☆20 · Jul 19, 2024 · Updated last year
- Advanced Formal Language Theory (263-5352-00L; Spring 2023) · ☆10 · Feb 21, 2023 · Updated 3 years ago
- MAchine Translation Evaluation Online (MATEO) · ☆26 · Jun 2, 2025 · Updated 11 months ago
- Efficient Symptom Inquiring and Diagnosis via Adaptive Alignment of Reinforcement Learning and Classification [AI in Medicine Journal] · ☆12 · May 20, 2022 · Updated 3 years ago
- Repository for Sparse Universal Transformers · ☆20 · Oct 23, 2023 · Updated 2 years ago
- Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task. · ☆24 · Feb 25, 2025 · Updated last year
- ☆25 · Apr 3, 2025 · Updated last year
- String Distance using cython · ☆13 · Jan 19, 2020 · Updated 6 years ago
- Code for the paper "Greed is All You Need: An Evaluation of Tokenizer Inference Methods" · ☆13 · Nov 26, 2024 · Updated last year
- Facilitates Visual Representation of Sign Language Data and Glosses · ☆19 · May 16, 2025 · Updated 11 months ago
- A different, but useful, textcat approach. · ☆18 · Jul 15, 2024 · Updated last year
- FlowMirror-HydraVox — A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens… · ☆49 · Feb 17, 2026 · Updated 2 months ago
- ☆141 · Apr 8, 2026 · Updated last month
- L&S 88-5 Connector Course to Data 8 · ☆15 · Apr 12, 2018 · Updated 8 years ago
- NOAH's Corpus: Part-of-Speech Tagging for Swiss German · ☆12 · Jan 6, 2023 · Updated 3 years ago
- Python port for IWNLP.Lemmatizer · ☆19 · Apr 13, 2026 · Updated 3 weeks ago
- AMR-to-text Generation with Graph Transformer · ☆18 · Nov 16, 2020 · Updated 5 years ago
- Implementation of "Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation" · ☆27 · Mar 4, 2021 · Updated 5 years ago
- Nanyang Technological University - Multilingual Corpus (STB subcorpora) · ☆12 · Mar 11, 2019 · Updated 7 years ago
- NOTSOFAR-1 Challenge: Distant Diarization and ASR · ☆60 · Feb 12, 2025 · Updated last year
- Do Multilingual Language Models Think Better in English? · ☆42 · Aug 3, 2023 · Updated 2 years ago
- This is a Python and Tensorflow implementation of Posenet v2 released by Google in TensorflowJS. · ☆26 · Jan 20, 2020 · Updated 6 years ago
- A new benchmark of 118 ICPC problems for evaluating LLM reasoning in competitive coding, featuring realistic ICPC competition scenario, r… · ☆17 · May 18, 2025 · Updated 11 months ago
- Muon fsdp 2 · ☆56 · Aug 8, 2025 · Updated 9 months ago
- Opus builds a tank game · ☆23 · Jan 19, 2026 · Updated 3 months ago
- Renmin University of China YOJ problem bank · ☆12 · Jun 9, 2022 · Updated 3 years ago
- Online voting platform for Shenzhen Middle School Student Council · ☆12 · Feb 7, 2018 · Updated 8 years ago
- ☆16 · Aug 14, 2023 · Updated 2 years ago
- German lemmatization with IWNLP as extension for spaCy · ☆27 · Apr 13, 2026 · Updated 3 weeks ago
- Adding random noise to a text dataset, and controlling very accurately the quality of the result · ☆20 · Apr 13, 2026 · Updated 3 weeks ago
- Contrastive Chain-of-Thought Prompting · ☆69 · Nov 18, 2023 · Updated 2 years ago
- Repository containing common Makefiles for setting up conda environments. · ☆10 · Feb 10, 2023 · Updated 3 years ago
- Sparse Attention with Linear Units · ☆20 · Apr 21, 2021 · Updated 5 years ago
- [SIGIR 2024] This is the official PyTorch implementation for the paper: "EulerFormer: Sequential User Behavior Modeling with Complex Vect… · ☆17 · Oct 5, 2024 · Updated last year
- [NAACL 2024] Making Language Models Better Tool Learners with Execution Feedback · ☆43 · Mar 14, 2024 · Updated 2 years ago
- First instruction-tuning dataset distilled from Claude2 (52k Alpaca prompts)! · ☆13 · Oct 22, 2023 · Updated 2 years ago
- Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems · ☆103 · Jan 25, 2026 · Updated 3 months ago
- 💾 A moleculer service mixin for minio and S3 💾 · ☆15 · Sep 16, 2022 · Updated 3 years ago
- ☆106 · Mar 1, 2026 · Updated 2 months ago