OCR training dataset created in the Digitized Materials OCR Text Conversion Project (デジタル化資料OCRテキスト化事業)
☆83 · Jun 26, 2024 · Updated last year
Alternatives and similar repositories for pdmocrdataset-part1
Users who are interested in pdmocrdataset-part1 are comparing it to the libraries listed below.
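- NDL-DocL dataset (document image layout dataset) ☆30 · Mar 2, 2023 · Updated 3 years ago
- Repository for the NDLOCR application (includes source code) ☆661 · Jan 5, 2026 · Updated 3 months ago
- NDL classical-book (古典籍) OCR training dataset (processed みんなで翻刻 data) ☆20 · Mar 13, 2026 · Updated last month
- Character image dataset (73-character hiragana version) ☆18 · Apr 6, 2020 · Updated 6 years ago
- Dataset of n-gram frequency statistics for OCR text data created from digitized materials ☆15 · Jan 10, 2023 · Updated 3 years ago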
- LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation ☆23 · Apr 24, 2024 · Updated last year
- OCR training dataset created in the OCR Processing Program Research and Development Project (OCR処理プログラム研究開発事業) ☆15 · Jun 26, 2024 · Updated last year
- A set of tools for extracting text data from PDFs for use in machine learning and similar applications ☆12 · Aug 4, 2021 · Updated 4 years ago
- ☆30 · Apr 10, 2025 · Updated last year
- ☆18 · Feb 9, 2025 · Updated last year
- Source code of the Next Digital Library (次世代デジタルライブラリー) ☆26 · Apr 27, 2023 · Updated 2 years ago
- Show notes for https://anchor.fm/yoheikikuta. ☆15 · Apr 24, 2022 · Updated 3 years ago
- Sample code for chatting with Google Chrome's built-in local LLM. ☆13 · Jan 15, 2025 · Updated last year
- Code and documentation to train Stanford's Alpaca models, and generate the data. ☆24 · Mar 19, 2023 · Updated 3 years ago
- Ono laboratory audio signal processing exercise for beginners. ☆19 · May 10, 2023 · Updated 2 years ago
- Materials and source code for the NLP2025 tutorial "Geographic Information and Language Processing: A Practical Introduction" (地理情報と言語処理 実践入門) ☆17 · Updated this week
- NDL classical-book OCR (NDL古典籍OCR) application (includes source code) ☆94 · Oct 14, 2025 · Updated 6 months ago
- Transcribing speech in order to converse with ChatGPT ☆22 · Mar 8, 2023 · Updated 3 years ago
- ☆19 · Mar 12, 2026 · Updated last month
- RealPersonaChat: A Realistic Persona Chat Corpus with Interlocutors' Own Personalities ☆63 · Mar 13, 2024 · Updated 2 years ago
- Japanese BERT Pretrained Model ☆23 · Nov 13, 2021 · Updated 4 years ago
- ☆32 · Apr 8, 2026 · Updated last week
- text-only archives of www.aozora.gr.jp ☆91 · Mar 22, 2023 · Updated 3 years ago
- MeCab + NEologd + Docker + Python3 ☆36 · May 10, 2022 · Updated 3 years ago
- ☆22 · Sep 18, 2023 · Updated 2 years ago
- 🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer ☆256 · Feb 7, 2026 · Updated 2 months ago
- SATySFi files used for 進捗大陸 ☆12 · May 22, 2023 · Updated 2 years ago
- Blinking an LED (Lチカ) on the Wio Terminal ☆21 · Feb 15, 2021 · Updated 5 years ago
- General-purpose Switch Transformer-based Japanese language model ☆118 · Sep 13, 2023 · Updated 2 years ago
- Unofficial browser extension for Scrapbox ☆30 · Jul 31, 2022 · Updated 3 years ago
- Japanese-BPEEncoder ☆41 · Sep 12, 2021 · Updated 4 years ago
- ☆16 · Nov 19, 2023 · Updated 2 years ago
- ☆15 · Nov 30, 2023 · Updated 2 years ago
- Easily turn large English text datasets into Japanese text datasets using open LLMs. ☆28 · Jan 20, 2025 · Updated last year
- ☆29 · Apr 8, 2026 · Updated last week
- RDF data for Knowledge Graph Reasoning Challenge. ☆21 · Feb 28, 2025 · Updated last year
- ☆12 · Dec 12, 2019 · Updated 6 years ago
- Pre-train Embedding in LightFM Recommender System Framework ☆11 · Apr 28, 2019 · Updated 6 years ago
- csvt is a command line tool for processing CSV. ☆13 · Jan 14, 2026 · Updated 3 months ago