ndl-lab / layout-dataset
NDL-DocL dataset (document image layout dataset)
☆26 Updated last year
Related projects
Alternatives and complementary repositories for layout-dataset
- OCR training dataset created in the Digitized Materials OCR Text Conversion Project ☆64 Updated 4 months ago
- Japanese-BPEEncoder ☆39 Updated 3 years ago
- ☆19 Updated last year
- Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020) ☆76 Updated last year
- This repository has implementations of data augmentation for Japanese NLP. ☆64 Updated last year
- Japanese synonym library ☆52 Updated 2 years ago
- Japanese CLIP model ☆13 Updated last year
- Japanese tokenizer for Transformers ☆78 Updated 10 months ago
- Mecab + NEologd + Docker + Python3 ☆35 Updated 2 years ago
- ☆31 Updated 3 months ago
- ☆18 Updated last month
- IPAdic packaged for easy use from Python. ☆25 Updated 3 years ago
- JMultiWOZ: A Large-Scale Japanese Multi-Domain Task-Oriented Dialogue Dataset ☆22 Updated 7 months ago
- Japanese CLIP by rinna Co., Ltd. ☆68 Updated 11 months ago
- A program that automatically extracts figures and tables ☆19 Updated 3 years ago
- ☆16 Updated 3 years ago
- Easily turn large English text datasets into Japanese text datasets using open LLMs. ☆13 Updated last week
- Kyoto University Text Corpus ☆59 Updated last year
- ☆15 Updated 11 months ago
- DistilBERT model pre-trained on 131 GB of Japanese web text. The teacher model is a BERT-base model built in-house at LINE. ☆43 Updated last year
- Japanese T5 model ☆112 Updated last month
- ☆24 Updated this week
- ☆13 Updated 2 months ago
- ☆82 Updated last year
- ☆50 Updated last year
- aMLP Transformer Model for Japanese ☆15 Updated 2 years ago
- [2024 edition] Text classification with BERT ☆23 Updated 4 months ago
- ☆18 Updated 5 months ago
- Code for COLING 2020 Paper ☆13 Updated this week
- RealPersonaChat: A Realistic Persona Chat Corpus with Interlocutors' Own Personalities ☆48 Updated 7 months ago