OCR training dataset created in the Digitized Materials OCR Text Conversion Project
☆83 Jun 26, 2024 Updated last year
Alternatives and similar repositories for pdmocrdataset-part1
Users that are interested in pdmocrdataset-part1 are comparing it to the libraries listed below.
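- NDL-DocL dataset (document image layout dataset) ☆30 Mar 2, 2023 Updated 3 years ago
- Repository for the NDLOCR application (includes source code) ☆671 Jan 5, 2026 Updated 4 months ago
- NDL Kotenseki (classical Japanese books) OCR training dataset (processed Minna de Honkoku data) ☆20 Mar 13, 2026 Updated last month
- Character image dataset (73-character hiragana edition) ☆18 Apr 6, 2020 Updated 6 years ago
- Dataset of n-gram frequency statistics for OCR text data created from digitized materials ☆15 Jan 10, 2023 Updated 3 years ago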
- LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation ☆23 Apr 24, 2024 Updated 2 years ago
- OCR training dataset created in the OCR Processing Program Research and Development Project ☆15 Jun 26, 2024 Updated last year
- A program that automatically extracts diagrams ☆19 Aug 4, 2021 Updated 4 years ago
- A set of tools for extracting text data from PDFs for use in machine learning and similar applications ☆12 Aug 4, 2021 Updated 4 years ago
- ☆18 Feb 9, 2025 Updated last year
- Source code of the Next Digital Library ☆26 Apr 30, 2026 Updated last week
- Show notes for https://anchor.fm/yoheikikuta. ☆15 Apr 24, 2022 Updated 4 years ago
- Sample code for chatting with Google Chrome's built-in local LLM. ☆13 Jan 15, 2025 Updated last year
- Code and documentation to train Stanford's Alpaca models, and generate the data. ☆24 Mar 19, 2023 Updated 3 years ago
- Ono laboratory audio signal processing exercise for beginners. ☆19 May 10, 2023 Updated 2 years ago
- Materials and source code for the NLP2025 tutorial "Geographic Information and Language Processing: A Practical Introduction" ☆17 Updated this week
- Test dataset of Japanese addresses that are difficult to parse ☆14 Sep 25, 2023 Updated 2 years ago
- NDL Kotenseki OCR application (includes source code) ☆97 Oct 14, 2025 Updated 6 months ago
- I want to transcribe speech and converse with ChatGPT ☆22 Mar 8, 2023 Updated 3 years ago
- ☆19 Mar 12, 2026 Updated last month
- RealPersonaChat: A Realistic Persona Chat Corpus with Interlocutors' Own Personalities ☆64 Mar 13, 2024 Updated 2 years ago
- Japanese BERT Pretrained Model ☆23 Nov 13, 2021 Updated 4 years ago
- ☆40 May 2, 2026 Updated last week
- The Kamogawa is basically a Kaikatsu CLUB ☆16 Jan 24, 2023 Updated 3 years ago
- text-only archives of www.aozora.gr.jp ☆91 Mar 22, 2023 Updated 3 years ago
- Mecab + NEologd + Docker + Python3 ☆36 May 10, 2022 Updated 3 years ago
- Japanese named entity recognition dataset built from Wikipedia ☆142 Sep 2, 2023 Updated 2 years ago
- ☆22 Sep 18, 2023 Updated 2 years ago
- 🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer ☆287 Feb 7, 2026 Updated 3 months ago
- Blinking an LED on the Wio Terminal ☆21 Feb 15, 2021 Updated 5 years ago
- General-purpose Switch Transformer based Japanese language model ☆118 Sep 13, 2023 Updated 2 years ago
- Unofficial browser extension for Scrapbox ☆30 Jul 31, 2022 Updated 3 years ago
- ☆15 Nov 20, 2025 Updated 5 months ago
- Japanese-BPEEncoder ☆41 Sep 12, 2021 Updated 4 years ago
- ☆16 Nov 19, 2023 Updated 2 years ago
- RDF data for Knowledge Graph Reasoning Challenge. ☆21 Feb 28, 2025 Updated last year
- ☆16 Nov 30, 2023 Updated 2 years ago
- Easily turn large English text datasets into Japanese text datasets using open LLMs. ☆29 Jan 20, 2025 Updated last year
- ☆29 Apr 28, 2026 Updated last week