MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning
☆95 · Aug 15, 2023 · Updated 2 years ago
Alternatives and similar repositories for MultilingualSIFT
Users who are interested in MultilingualSIFT are comparing it to the repositories listed below.
- KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models ☆25 · Aug 24, 2024 · Updated last year
- An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset ☆28 · Jan 19, 2025 · Updated last year
- Official Code for M-RᴇᴡᴀʀᴅBᴇɴᴄʜ: Evaluating Reward Models in Multilingual Settings (ACL 2025 Main) ☆40 · May 16, 2025 · Updated 9 months ago
- Placeholder repository ☆15 · Mar 16, 2022 · Updated 3 years ago
- Multilingual Large Language Models Evaluation Benchmark ☆132 · Aug 21, 2024 · Updated last year
- ☆16 · May 8, 2024 · Updated last year
- Do Multilingual Language Models Think Better in English? ☆42 · Aug 3, 2023 · Updated 2 years ago
- Difference-based Contrastive Learning for Korean Sentence Embeddings ☆23 · Updated this week
- ReCross: Unsupervised Cross-Task Generalization via Retrieval Augmentation ☆24 · May 1, 2022 · Updated 3 years ago
- Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?" ☆26 · Jun 3, 2025 · Updated 8 months ago
- A Multilingual Replicable Instruction-Following Model ☆96 · Jun 11, 2023 · Updated 2 years ago
- ☆15 · Nov 22, 2023 · Updated 2 years ago
- Academic paper PDF translator with Korean language focus, powered by AWS Bedrock and preserving formulas, charts, and layouts ☆23 · Feb 13, 2026 · Updated 2 weeks ago
- A simple semi-supervised approach for creating huggingface data script loaders and upload to the hub. ☆11 · Jun 23, 2024 · Updated last year
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆149 · Oct 27, 2024 · Updated last year
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆254 · Oct 31, 2023 · Updated 2 years ago
- Utility scripts for preprocessing Wikipedia texts for NLP ☆78 · Apr 9, 2024 · Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Aug 25, 2023 · Updated 2 years ago
- hllama is a library which aims to provide a set of utility tools for large language models. ☆10 · Apr 16, 2024 · Updated last year
- ☆12 · Dec 6, 2024 · Updated last year
- ☆12 · Apr 17, 2024 · Updated last year
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco… ☆36 · Aug 29, 2025 · Updated 6 months ago
- code for the table-based open domain question answering project, with paper title: "Reasoning over Hybrid Chain for Table-and-Text Open D… ☆12 · Sep 16, 2022 · Updated 3 years ago
- Code and Data for ManyModalQA: Modality Disambiguation and QA over Diverse Inputs ☆17 · Mar 2, 2020 · Updated 6 years ago
- A neural and statistical engine for accurately adding diacritics (Tashkeel) to Arabic text. First-place winner on Kaggle 🥇 ☆18 · May 29, 2025 · Updated 9 months ago
- Vision Large Language Models trained on M3IT instruction tuning dataset ☆17 · Aug 16, 2023 · Updated 2 years ago
- [EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models" ☆36 · Jun 7, 2025 · Updated 8 months ago
- [EMNLP 2022] TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data ☆17 · May 17, 2023 · Updated 2 years ago
- Zero-shot Learning by Generating Task-specific Adapters ☆14 · Apr 2, 2021 · Updated 4 years ago
- CAMeL Dataset ☆15 · Apr 15, 2025 · Updated 10 months ago
- a set of scripts to easily convert all training data from huggingface into alpaca instruct or sharegpt format, which should allow for eas… ☆18 · Mar 14, 2025 · Updated 11 months ago
- Data and Code for EMNLP 2022 paper "ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples" ☆15 · Jun 4, 2023 · Updated 2 years ago
- Are foundation LMs multilingual knowledge bases? (EMNLP 2023) ☆19 · Dec 8, 2023 · Updated 2 years ago
- ☆17 · Apr 11, 2024 · Updated last year
- A publishing website of a table collecting meta-learning-related papers in the area of human language processing. ☆17 · Aug 2, 2021 · Updated 4 years ago
- Introducing Filtered Direct Preference Optimization (fDPO) that enhances language model alignment with human preferences by discarding lo… ☆16 · Nov 27, 2024 · Updated last year
- SOTA Math Opensource LLM ☆335 · Dec 12, 2023 · Updated 2 years ago
- Word acquisition in neural language models (TACL 2022). ☆20 · Jan 30, 2025 · Updated last year
- YesBut - Multimodal Satire Comprehension Dataset ☆18 · Oct 23, 2024 · Updated last year