SentencePiece-based BPE tokenizer for English and Japanese text.
☆28Apr 4, 2024Updated last year
Alternatives and similar repositories for novelai-tokenizer
Users that are interested in novelai-tokenizer are comparing it to the libraries listed below.
- Japanese instruction data (日本語指示データ)☆24Jul 13, 2023Updated 2 years ago
- GPT2 Byte Pair Encoding implementation in Golang☆25Jul 9, 2025Updated 8 months ago
- Karras et al. (2022) diffusion models for PyTorch☆17Oct 5, 2023Updated 2 years ago
- Crawler and cleaner for NovelAI embedding training data☆21May 22, 2025Updated 10 months ago
- ☆24Dec 15, 2023Updated 2 years ago
- ☆16Nov 19, 2023Updated 2 years ago
- A magic notepad. δ☆14May 21, 2023Updated 2 years ago
- ☆42Jun 22, 2023Updated 2 years ago
- Discord bot that uses AI to generate images based on Discord messages☆11Oct 10, 2023Updated 2 years ago
- ☆33Jul 31, 2024Updated last year
- An extension of cafe_aesthetic for AUTOMATIC1111's Stable Diffusion Web UI☆38Sep 14, 2023Updated 2 years ago
- The repository contains scripts and merge scripts that have been modified to adapt an Alpaca-Lora adapter for LoRA tuning when assuming t…☆18May 24, 2023Updated 2 years ago
- Metadata extraction scripts for images generated with NovelAI's image generation☆37Feb 4, 2026Updated last month
- Annotated Fuman Kaitori Center Corpus☆18Dec 18, 2023Updated 2 years ago
- Repository for the extension JS of the NovelAI 5ch wiki☆12Oct 16, 2022Updated 3 years ago
- Generic classification model☆10Apr 2, 2025Updated 11 months ago
- Japanese chat dataset for building LLMs☆88Jan 23, 2024Updated 2 years ago
- ☆115Sep 18, 2023Updated 2 years ago
- ☆27Mar 30, 2023Updated 2 years ago
- A beamer template mainly for Japanese.☆14Apr 21, 2024Updated last year
- Disambiguate Japanese heteronyms☆32Oct 3, 2023Updated 2 years ago
- A re-implementation of Stable-Diffusion using better code practices with faster and lower-memory usage.☆45Feb 8, 2023Updated 3 years ago
- ☆12Sep 26, 2023Updated 2 years ago
- This is a repository for conversations using OpenAI API (compatible with ChatGPT) or llama.cpp in Stable Diffusion web UI.☆43Mar 19, 2025Updated last year
- ☆15Apr 14, 2024Updated last year
- ☆40Feb 25, 2026Updated last month
- ☆28Nov 23, 2023Updated 2 years ago
- An unofficial vits2-TTS implementation in PyTorch, adapted to support 44100Hz Japanese audio.☆24Sep 1, 2023Updated 2 years ago
- ☆18Sep 29, 2024Updated last year
- ☆15Apr 22, 2023Updated 2 years ago
- A simple Python script to easily grab tags of an image on Danbooru☆10Mar 17, 2023Updated 3 years ago
- Deploy your HPC Cluster on AWS in 20min. with just 1-Click.☆55Oct 29, 2025Updated 4 months ago
- Platform and API Agnostic library for powering chatbots☆24Feb 27, 2023Updated 3 years ago
- Webui Extension for customizing Highres. fix and improve details.☆48Oct 16, 2023Updated 2 years ago
- Python version of the Japanese semantic role labeling system (ASA)☆22Nov 11, 2022Updated 3 years ago
- Implementation of aspect ratio bucketing for training generative image models as described in: https://blog.novelai.net/novelai-improveme…☆397Sep 27, 2024Updated last year
- A Japanese translation of the Alpaca dataset☆86Jun 3, 2023Updated 2 years ago
- Tunneling extension for automatic1111 sd-webui☆87Mar 13, 2024Updated 2 years ago
- Utility scripts for preprocessing Wikipedia texts for NLP☆78Apr 9, 2024Updated last year