A flexible sentence segmentation library using a CRF model and regex rules
★31 · Apr 16, 2026 · Updated last month
Alternatives and similar repositories for sentsplit
Users that are interested in sentsplit are comparing it to the libraries listed below.
- Korean large emotion labeled dataset (EmoNSMC) ★14 · Mar 5, 2020 · Updated 6 years ago
- Official code and dataset for our CCGPK@COLING 2022 paper - "PersonaChatGen: Generating Personalized Dialogue using GPT-3" ★13 · Mar 26, 2024 · Updated 2 years ago
- Code and data for Koo et al's ACL 2024 paper "Benchmarking Cognitive Biases in Large Language Models as Evaluators" ★23 · Feb 16, 2024 · Updated 2 years ago
- Official code and dataset for our EMNLP 2024 Findings paper: Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Kn… ★19 · Dec 27, 2024 · Updated last year
- A simple Scrapy script for crawling Reuters news articles (Python 3) ★14 · Jan 17, 2018 · Updated 8 years ago
- Data for EMNLP 2022 paper "arXivEdits: Understanding the Human Revision Process in Scientific Writing". ★14 · Sep 30, 2023 · Updated 2 years ago
- Korean Movie Review Emotion (KMRE) Dataset ★21 · Sep 7, 2020 · Updated 5 years ago
- A Python library to query a player's Overwatch stats from Battle.net ★13 · Nov 12, 2018 · Updated 7 years ago
- Google's Conceptual Captions Dataset translated into Korean ★23 · Aug 28, 2022 · Updated 3 years ago
- MeCab model trained with OpenKorPos. ★23 · Jun 19, 2022 · Updated 3 years ago
- A Python binding for mecab-ko ★110 · Jul 14, 2024 · Updated last year
- ★11 · Oct 3, 2021 · Updated 4 years ago
- Provides functions for converting Modu Corpus data into a form convenient for analysis. ★11 · Mar 2, 2022 · Updated 4 years ago
- Machine Generated Captions for Best Artworks ★22 · Sep 21, 2022 · Updated 3 years ago
- ★14 · Dec 9, 2021 · Updated 4 years ago
- A project for Korean automatic spacing ★12 · Aug 3, 2020 · Updated 5 years ago
- ★10 · Oct 21, 2022 · Updated 3 years ago
- Repository for DISRPT2021 shared task ★16 · Sep 5, 2022 · Updated 3 years ago
- Korean Wikipedia corpus segmented into sentences. Download it from Releases or use it via tfds-korean. ★24 · Sep 6, 2023 · Updated 2 years ago
- Official repository for Automated Learning Rate Scheduler for Large-Batch Training (8th ICML Workshop on AutoML) ★39 · Dec 3, 2021 · Updated 4 years ago
- Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning" ★11 · Jan 10, 2025 · Updated last year
- A Translation Task using TurboTransformers ★10 · Dec 17, 2020 · Updated 5 years ago
- Official code for our COLING 2022 paper: In-Context Learning for Empathetic Dialogue Generation ★20 · Mar 1, 2023 · Updated 3 years ago
- Repo for the LREC 2022 paper The Project Dialogism Novel Corpus: A Dataset for Quotation Attribution in Literary Texts. ★14 · Jul 27, 2022 · Updated 3 years ago
- ACL 2021 paper "Style is NOT a single variable: Case Studies for Cross-Style Language Understanding" by Dongyeop Kang and Eduard Hovy ★15 · Jul 19, 2021 · Updated 4 years ago
- PropSegmEnt is an annotated dataset for segmenting English text into propositions, and recognizing proposition-level entailment relations… ★21 · Dec 21, 2022 · Updated 3 years ago
- An open-source online generative dictionary ★13 · May 29, 2022 · Updated 4 years ago
- ★62 · Aug 2, 2023 · Updated 2 years ago
- Convert Numerical Representations to Korean Pronunciation ★14 · Apr 20, 2020 · Updated 6 years ago
- The official Python client library for deeq NLP, a new deep-learning-based Korean NLP system. ★21 · Aug 2, 2022 · Updated 3 years ago
- This repository provides the dataset introduced by our WSSTG paper ★13 · Jul 21, 2019 · Updated 6 years ago
- ★23 · Oct 30, 2023 · Updated 2 years ago
- A utility for storing and reading files for Korean LM training ★35 · Oct 15, 2025 · Updated 7 months ago
- STRODE: Stochastic Boundary Ordinary Differential Equation ★13 · Jul 20, 2021 · Updated 4 years ago
- Python Hangul processing library. Python Korean Morphological Analyzer ★19 · Feb 4, 2025 · Updated last year
- ★12 · Nov 7, 2024 · Updated last year
- Crawler for Namuwiki, Wikipedia, Daum Blog, Tistory, YouTube, and Nate Pann ★13 · Feb 20, 2026 · Updated 3 months ago
- Code for the neural-Jacana aligner and data for the MultiMWA dataset. ★20 · Feb 12, 2023 · Updated 3 years ago
- Parallel dataset of Korean Questions and Commands ★60 · Mar 24, 2023 · Updated 3 years ago