h9-tect / llama2-qlora-finetunined-Arabic
☆10Updated last year
Alternatives and similar repositories for llama2-qlora-finetunined-Arabic
Users who are interested in llama2-qlora-finetunined-Arabic are comparing it to the libraries listed below.
- Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation". EMNLP 2023.☆16Updated 9 months ago
- Instruction dataset for Arabic with 10,000 instruction and output pairs. CIDAR can be used to fine-tune LLMs to follow instructions.☆40Updated 2 months ago
- This is the official repository for Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks.☆25Updated 5 months ago
- ☆22Updated 3 months ago
- The Arabic Error Type Annotation tool aims to annotate Arabic error types following the ALC tagset annotation.☆9Updated 2 years ago
- ☆11Updated 4 months ago
- A Multilingual Replicable Instruction-Following Model☆93Updated last year
- Vocabulary Trimming (VT) is a model compression technique, which reduces a multilingual LM vocabulary to a target language by deleting ir…☆38Updated 7 months ago
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback☆96Updated last year
- This repository is the implementation of a Transformer model called MarianCG which is developed for the Code Generation problem.☆21Updated 2 years ago
- Synthetic Data Generation for Evaluation☆14Updated 3 months ago
- ☆40Updated last month
- ☆21Updated 11 months ago
- Aranizer: A Custom Tokenizer based on SentencePiece and BPE tailored for Arabic Language Modeling☆20Updated 10 months ago
- ☆124Updated last year
- A simple semi-supervised approach for creating huggingface data script loaders and upload to the hub.☆11Updated 11 months ago
- Arabic Tokenization Library. It provides many tokenization algorithms.☆104Updated last year
- NTREX -- News Test References for MT Evaluation☆83Updated last year
- Arabic deep-learning based diacritization models (Shakkala, Shakkelha) ported to PyTorch☆14Updated 2 years ago
- ☆27Updated 3 weeks ago
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects☆21Updated 4 months ago
- Code for Zero-Shot Tokenizer Transfer☆128Updated 4 months ago
- ☆27Updated 8 months ago
- Code for Arabic Nougat☆42Updated 6 months ago
- This repository contains the Arabic sarcasm dataset (ArSarcasm)☆24Updated 4 years ago
- Fine-tuning Open-Source LLMs for Adaptive Machine Translation☆79Updated 2 weeks ago
- Arabic cleaning, normalization and segmentation library.☆69Updated last year
- UBC ARBERT and MARBERT Deep Bidirectional Transformers for Arabic☆108Updated 3 years ago
- Code and models for "The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models". EACL 2021, WANLP.☆47Updated 11 months ago
- Python interface for evaluation on ChatGPT models☆19Updated last year