tsmatz / huggingface-finetune-japanese
Examples of fine-tuning encoder-only and encoder-decoder transformers for the Japanese language with Hugging Face (Oct 2022)
☆16 · Updated 2 years ago
Alternatives and similar repositories for huggingface-finetune-japanese
Users who are interested in huggingface-finetune-japanese are comparing it to the repositories listed below.
- Comparing M2M and mT5 on rare language pairs, blog post: https://medium.com/@abdessalemboukil/comparing-facebooks-m2m-to-mt5-in-low-re… ☆16 · Updated 4 years ago
- Japanese / English Bilingual LLM ☆28 · Updated last week
- A series of notebooks demonstrating how to build simple NLP web apps with Gradio and Hugging Face transformers ☆44 · Updated 2 months ago
- ☆22 · Updated last year
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆87 · Updated last year
- This repository contains the relevant materials for the tutorial "Legal IR and NLP: the History, Challenges, and State-of-the-Art", held … ☆41 · Updated 2 years ago
- Theoretical introduction and examples for natural language processing (Sep 2022) ☆33 · Updated 4 months ago
- Streamlit app to translate text to or between 50 languages with mBART-50 from Hugging Face and Facebook ☆25 · Updated 4 years ago
- An example of multilingual machine translation using a pretrained version of mT5 from Hugging Face. ☆43 · Updated 4 years ago
- HuggingChat-like UI in Gradio ☆70 · Updated 2 years ago
- Japanese LLaMa experiment ☆54 · Updated 2 months ago
- A Streamlit app running a GPT-2 language model for text classification, built with PyTorch, Transformers and AWS SageMaker. ☆39 · Updated 3 years ago
- ☆32 · Updated 3 years ago
- Financial Domain Question Answering with pre-trained BERT Language Model ☆131 · Updated 5 months ago
- MAFAND-MT ☆60 · Updated last year
- ☆64 · Updated 2 years ago
- ☆41 · Updated last year
- Document Q&A on Wikipedia articles using LLMs ☆79 · Updated 2 years ago
- ☆43 · Updated last year
- Using short models to classify long texts ☆21 · Updated 2 years ago
- YAST - Yet Another SPLADE or Sparse Trainer ☆21 · Updated 6 months ago
- Stabilize and achieve excellent performance with transformers ☆41 · Updated 3 years ago
- Paraphrasing for academic texts ☆14 · Updated 3 years ago
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆89 · Updated last year
- Logical structure analysis for visually structured documents ☆95 · Updated 3 years ago
- Retrieval augmented generation demos with open-source DeepSeek, Llama, Qwen, Mistral, Gemma ☆42 · Updated 4 months ago
- Consists of the largest (10K) human-annotated code-switched semantic parsing dataset & 170K generated utterances using the CST5 augmentati… ☆41 · Updated 2 years ago
- Annotation meets Large Language Models (ChatGPT, GPT-3 and alike). ☆58 · Updated 2 years ago
- We finetune Bloomz-7b1-mt using LoRA with the chatdoctor-200k dataset; see https://huggingface.co/LinhDuong/doctorwithbloomz-7b1-mt an… ☆30 · Updated 2 years ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆71 · Updated last year