kevinkrahn / ancient-greek-datasets
Datasets for training and evaluating Ancient Greek sentence embedding models
☆11 Updated 9 months ago
Alternatives and similar repositories for ancient-greek-datasets:
Users who are interested in ancient-greek-datasets are comparing it to the repositories listed below.
- ☆27 Updated 7 months ago
- Latin BERT ☆60 Updated 10 months ago
- A context-aware embedding similarity score ☆11 Updated last year
- File format, model, API, and apps for manipulating text and its annotated features ☆70 Updated last week
- Ancient Greek language models for spaCy ☆28 Updated last month
- Morpheus morphological analysis engine ☆12 Updated last year
- ☆67 Updated last year
- ☆21 Updated 3 months ago
- Layout Analysis Dataset with Segmonto (LADaS) ☆20 Updated 2 months ago
- Python library for generating (and analyzing) Ancient Greek inflectional paradigms ☆42 Updated 2 years ago
- Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Langua… ☆38 Updated 3 years ago
- Syntax trees, morphology, and linguistic annotations for the Greek Bible ☆26 Updated last month
- In-browser OCR of Ancient Greek and Latin ☆26 Updated this week
- XML files for the works in the First Thousand Years of Greek Project. Please see our Wiki on how to contribute. ☆94 Updated 3 months ago
- The APAC AI Hub for Documents, Product Briefs, Plans, and SOPS, we're currently raising SAFE 100M$ at a 1Billion$ valuation. ☆11 Updated last year
- Small python package to measure OCR quality and other related metrics. ☆21 Updated last year
- Download, parse, and filter data from Phil Papers. Data-ready for The-Pile. ☆15 Updated last year
- Structured data for classical studies ☆19 Updated 8 years ago
- Contains stuff for the Hebrew/Syriac morphology project of the Escience center/ETCBC ☆12 Updated 3 months ago
- BPE modification that implements removing of the intermediate tokens during tokenizer training. ☆25 Updated 5 months ago
- 🚀 Automatically convert unstructured data into a high-quality 'textbook' format, optimized for fine-tuning Large Language Models (LLMs) ☆26 Updated last year
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an… ☆13 Updated this week
- Programming for Historians ☆15 Updated 2 years ago
- Hebrew Bible + Linguistic annotations in text-fabric format. Fixed and ongoing versions. ☆53 Updated 4 months ago
- A general-purpose NLP pipeline for Ancient Greek ☆22 Updated last year
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆15 Updated last week
- ☆37 Updated 7 years ago
- A forest of autonomous agents. ☆19 Updated 2 months ago
- A collection of computer tools for aiding the text critical workflow from transcription to collation to analysis. ☆22 Updated 3 weeks ago
- Modified Beam Search with periodical restart ☆12 Updated 7 months ago