☆43Mar 30, 2024Updated 2 years ago
Alternatives and similar repositories for shisa
Users that are interested in shisa are comparing it to the libraries listed below.
- ☆24Dec 15, 2023Updated 2 years ago
- Utility scripts for preprocessing Wikipedia texts for NLP☆78Apr 9, 2024Updated 2 years ago
- Japanese / English Bilingual LLM☆30Dec 23, 2025Updated 5 months ago
- The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.☆126Apr 10, 2026Updated last month
- Japanese instruction data (日本語指示データ)☆24Jul 13, 2023Updated 2 years ago
- DIRECT: Direct and Indirect REsponses in Conversational Text Corpus☆17Jul 1, 2021Updated 4 years ago
- YAST - Yet Another SPLADE or Sparse Trainer☆21Jun 16, 2025Updated 11 months ago
- A framework for few-shot evaluation of autoregressive language models.☆154Sep 13, 2024Updated last year
- Preferred Generation Benchmark☆94Mar 6, 2026Updated 2 months ago
- Easily turn large English text datasets into Japanese text datasets using open LLMs.☆29Jan 20, 2025Updated last year
- LaTeX document class for the proceedings of ANLP☆21Oct 28, 2025Updated 7 months ago
- JMultiWOZ: A Large-Scale Japanese Multi-Domain Task-Oriented Dialogue Dataset, LREC-COLING 2024☆25Mar 27, 2024Updated 2 years ago
- Trials of pre-trained BERT models for the medical domain in Japanese.☆13Nov 21, 2020Updated 5 years ago
- ☆21Jan 11, 2023Updated 3 years ago
- ☆33Jul 31, 2024Updated last year
- ☆17Apr 11, 2024Updated 2 years ago
- Japanese chat dataset for building LLMs☆87Jan 23, 2024Updated 2 years ago
- JQaRA: Japanese Question Answering with Retrieval Augmentation - a Japanese Q&A dataset for evaluating retrieval-augmented generation (RAG)☆44Sep 9, 2025Updated 8 months ago
- Scripts for creating a Japanese-English parallel corpus and training NMT models☆18Nov 9, 2021Updated 4 years ago
- JGLUE: Japanese General Language Understanding Evaluation☆342Mar 31, 2025Updated last year
- Training and evaluation scripts for JGLUE, a Japanese language understanding benchmark☆18May 20, 2026Updated last week
- ☆16Nov 19, 2023Updated 2 years ago
- ☆24Dec 26, 2023Updated 2 years ago
- Pre-training Language Models for Japanese☆50Jul 2, 2023Updated 2 years ago
- Japanese LLaMa experiment☆54Dec 27, 2025Updated 5 months ago
- Japanese Massive Multitask Language Understanding Benchmark☆40Oct 7, 2025Updated 7 months ago
- ⚡Japanese sentence splitting (Japanese sentence boundary detector), 40–250× faster via a Rust-accelerated Python library with near-perfect API compatibility with …☆74Oct 14, 2025Updated 7 months ago
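- EasyLightChatAssistant is an environment for easily trying out LightChatAssistant, a lightweight local Japanese model without censorship or restrictions, using KoboldCpp.☆45Apr 23, 2024Updated 2 years ago
- Evaluation dataset for the honorific (keigo) conversion task☆21Nov 24, 2022Updated 3 years ago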
- ☆44Apr 10, 2025Updated last year
- ☆31Apr 21, 2023Updated 3 years ago
- Mixtral-based Ja-En (En-Ja) Translation model☆20Jan 6, 2025Updated last year
- An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.☆13Jun 7, 2023Updated 2 years ago
- Benchmark for Japanese document embedding & vector search☆29Mar 12, 2024Updated 2 years ago
- Datasets related to law and legal precedents☆52Jan 8, 2025Updated last year
- ☆50Apr 10, 2024Updated 2 years ago
- Code to train Sentence BERT Japanese model for Hugging Face Model Hub☆11Aug 8, 2021Updated 4 years ago
- 🛥 Vaporetto is a fast and lightweight pointwise prediction based tokenizer. This is a Python wrapper for Vaporetto.☆21Jun 1, 2025Updated 11 months ago
- A project for self-implementation of deep learning on FPGAs☆17Aug 24, 2020Updated 5 years ago