tsmatz / huggingface-finetune-japanese
Examples of fine-tuning encoder-only and encoder-decoder transformers for the Japanese language with Hugging Face (Oct 2022)
☆15Updated last year
Alternatives and similar repositories for huggingface-finetune-japanese
Users who are interested in huggingface-finetune-japanese are comparing it to the libraries listed below
- A collection of preprocessed datasets and pretrained models for generating paraphrases.☆29Updated 3 years ago
- Pre-training Language Models for Japanese☆49Updated last year
- FRAKE: Fusional Real-time Automatic Keyword Extraction☆21Updated last year
- Annotation meets Large Language Models (ChatGPT, GPT-3 and alike).☆56Updated 2 years ago
- Fast whitespace correction with Transformers☆16Updated last year
- Using short models to classify long texts☆21Updated 2 years ago
- Fine-tune ModernBERT on a large Dataset with Custom Tokenizer Training☆66Updated 3 months ago
- Abstractive and Extractive Text summarization using Transformers.☆83Updated last year
- Japanese LLaMa experiment☆53Updated 5 months ago
- Do Multilingual Language Models Think Better in English?☆41Updated last year
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases".☆98Updated 2 years ago
- Tokenizer, POS-tagger, and dependency parser with BERT/RoBERTa/DeBERTa/GPT models for Japanese and other languages☆50Updated last month
- ☆31Updated 2 years ago
- Domain-Specific Text Generation for Machine Translation (with LLMs) - scripts and config files for the paper☆16Updated last year
- MAFAND-MT☆55Updated 10 months ago
- The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization☆156Updated 2 years ago
- Japanese / English Bilingual LLM☆15Updated last week
- A Streamlit app running GPT-2 language model for text classification, built with Pytorch, Transformers and AWS SageMaker.☆39Updated 3 years ago
- ☆14Updated 3 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language☆71Updated last year
- ☆17Updated last year
- ☆41Updated last year
- Lightblue LLM Eval Framework: tengu, elyza100, ja-mtbench, rakuda☆12Updated last week
- Using open source LLMs to build synthetic datasets for direct preference optimization☆61Updated last year
- The Business Scene Dialogue corpus☆68Updated 3 years ago
- MobileBERT and DistilBERT for extractive summarization☆89Updated last year
- A collection of various NLP datasets, mainly Indonesia-related languages.☆13Updated 3 years ago
- GrammarTagger — A Neural Multilingual Grammar Profiler for Language Learning☆27Updated 4 years ago
- An example of multilingual machine translation using a pretrained version of mt5 from Hugging Face.☆42Updated 4 years ago
- Efficient few-shot learning with cross-encoders.☆51Updated last year