Data-related codebase for the Polyglot project
☆19Mar 30, 2023Updated 2 years ago
Alternatives and similar repositories for polyglot-data
Users that are interested in polyglot-data are comparing it to the libraries listed below
- An example of fine-tuning kogpt with oslo.☆23Aug 26, 2022Updated 3 years ago
- 🤗 Sample code for training an LM with minimal setup☆59May 23, 2023Updated 2 years ago
- ☆14Dec 9, 2021Updated 4 years ago
- Calculating Expected Time for training LLM.☆38Apr 17, 2023Updated 2 years ago
- [Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation☆11May 27, 2022Updated 3 years ago
- Polyglot: Large Language Models of Well-balanced Competence in Multi-languages☆484Aug 22, 2023Updated 2 years ago
- Beyond LM: How can language model go forward in the future?☆15Apr 30, 2023Updated 2 years ago
- OSLO: Open Source for Large-scale Optimization☆175Sep 9, 2023Updated 2 years ago
- Korean T5 model☆54Dec 7, 2021Updated 4 years ago
- Machine Generated Captions for Best Artworks☆22Sep 21, 2022Updated 3 years ago
- A collection of public Korean instruction datasets for training language models.☆19Jul 16, 2023Updated 2 years ago
- Yet another python binding for mecab-ko☆88May 16, 2023Updated 2 years ago
- Data processing system for polyglot☆93Sep 5, 2023Updated 2 years ago
- Pure python implementation of DARTS (Double ARray Trie System)☆24Dec 7, 2022Updated 3 years ago
- Korean Named Entity Corpus☆25May 12, 2023Updated 2 years ago
- Alchera AI Competition 2nd Solution (body part segmentation)☆23Dec 7, 2021Updated 4 years ago
- This repo is for Korean wiki table question answering datasets described in the paper of Korean-Specific Dataset for Table Question Answe…☆91Oct 22, 2024Updated last year
- Similar string search in Levenshtein distance☆21Jun 19, 2021Updated 4 years ago
- Korean Commonsense Knowledge Graph☆15Dec 23, 2022Updated 3 years ago
- baikal.ai's pre-trained BERT models: descriptions and sample codes☆12Jun 24, 2021Updated 4 years ago
- KoGPT2 on Huggingface Transformers☆33May 4, 2021Updated 4 years ago
- #인권코퍼스 (Human Rights Corpus)☆31Oct 6, 2023Updated 2 years ago
- Opensource chatbot framework☆16Aug 1, 2021Updated 4 years ago
- RL Implementation☆19May 10, 2022Updated 3 years ago
- Notes compiled from a Python course on work automation☆13Oct 10, 2017Updated 8 years ago
- Provides functionality for converting 모두의 말뭉치 (Modu Corpus) data into a form convenient for analysis.☆11Mar 2, 2022Updated 4 years ago
- Dataset for the NLPMC @ NAACL 2021 Paper: Assertion Detection in Clinical Notes: Medical Language Models to the Rescue?☆16Sep 28, 2021Updated 4 years ago
- ☆35May 18, 2023Updated 2 years ago
- Reading self-supervised learning in NLP backwards☆27Oct 30, 2022Updated 3 years ago
- ☆33Aug 30, 2023Updated 2 years ago
- final-project-level3-nlp-02 created by GitHub Classroom☆11Dec 31, 2021Updated 4 years ago
- [Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation☆28Dec 9, 2022Updated 3 years ago
- Large-scale language modeling tutorials with PyTorch☆293Nov 2, 2021Updated 4 years ago
- Pecab: Pure python Korean morpheme analyzer based on Mecab☆172Apr 27, 2024Updated last year
- Multidocument Summarization for Literature Review Shared Task 2022☆30Oct 16, 2022Updated 3 years ago
- Simple Tensorflow implementation of "Toward Spatially Unbiased Generative Models" (ICCV 2021)☆15Jan 21, 2022Updated 4 years ago
- A category classification model based on natural language processing of product names, using transformer blocks☆10Dec 5, 2022Updated 3 years ago
- This repository provides a framework to serve LLM (Large Language Model) based applications such as chatbots.☆18Apr 20, 2023Updated 2 years ago
- ☆12May 3, 2022Updated 3 years ago