MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning
☆97Aug 15, 2023Updated 2 years ago
Alternatives and similar repositories for MultilingualSIFT
Users that are interested in MultilingualSIFT are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models☆25Aug 24, 2024Updated last year
- An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset☆28Jan 19, 2025Updated last year
- A simple semi-supervised approach for creating huggingface data script loaders and upload to the hub.☆11Jun 23, 2024Updated 2 years ago
- Multilingual Large Language Models Evaluation Benchmark☆133Aug 21, 2024Updated last year
- ☆16May 8, 2024Updated 2 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Placeholder repository☆15Mar 16, 2022Updated 4 years ago
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco…☆38Aug 29, 2025Updated 10 months ago
- Do Multilingual Language Models Think Better in English?☆42Aug 3, 2023Updated 2 years ago
- Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?"☆26Jun 3, 2025Updated last year
- EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization☆13Mar 20, 2025Updated last year
- [EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models"☆37Jun 7, 2025Updated last year
- ☆15Nov 22, 2023Updated 2 years ago
- The official code repo and data hub of top_nsigma sampling strategy for LLMs.☆26Feb 11, 2025Updated last year
- German Alpaca Dataset (Cleaned + Translated)☆26Apr 6, 2023Updated 3 years ago
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- A Multilingual Replicable Instruction-Following Model☆97Jun 11, 2023Updated 3 years ago
- PyTorch implementation of the paper: Vector Projection Network for Few-shot Slot Tagging in Natural Language Understanding. Su Zhu, Ruish…☆18Nov 10, 2021Updated 4 years ago
- Are foundation LMs multilingual knowledge bases? (EMNLP 2023)☆19Dec 8, 2023Updated 2 years ago
- ReCross: Unsupervised Cross-Task Generalization via Retrieval Augmentation☆23May 1, 2022Updated 4 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…☆31Aug 25, 2023Updated 2 years ago
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning☆257Oct 31, 2023Updated 2 years ago
- ☆20Jul 24, 2024Updated last year
- A neural and statistical engine for accurately adding diacritics (Tashkeel) to Arabic text. First-place winner on Kaggle 🥇☆18May 29, 2025Updated last year
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects☆26May 20, 2026Updated last month
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- A simple Python implementation of ngram sunburst (nested pie chart) visualization showed in CoQA paper☆14Mar 12, 2019Updated 7 years ago
- Utility scripts for preprocessing Wikipedia texts for NLP☆78Apr 9, 2024Updated 2 years ago
- code for the table-based open domain question answering project, with paper title: "Reasoning over Hybrid Chain for Table-and-Text Open D…☆12Sep 16, 2022Updated 3 years ago
- A publishing website of a table collecting meta-learning-related papers in the area of human language processing.☆17Aug 2, 2021Updated 4 years ago
- hllama is a library which aims to provide a set of utility tools for large language models.☆10Apr 16, 2024Updated 2 years ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]☆146Oct 27, 2024Updated last year
- Difference-based Contrastive Learning for Korean Sentence Embeddings☆23Mar 11, 2026Updated 3 months ago
- This repository contains an extension of fairseq for pixel / visual representations of text for machine translation.☆37Feb 2, 2024Updated 2 years ago
- [ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia☆174Jul 30, 2024Updated last year
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- The official github repo for MixEval-X, the first any-to-any, real-world benchmark.☆17Feb 15, 2025Updated last year
- BLOOM+1: Adapting BLOOM model to support a new unseen language☆74Mar 2, 2024Updated 2 years ago
- Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning"☆23Oct 26, 2021Updated 4 years ago
- CAMeL Dataset☆15Apr 15, 2025Updated last year
- About Official PyTorch implementation of "Query-Efficient Black-Box Red Teaming via Bayesian Optimization" (ACL'23)☆15Jul 9, 2023Updated 2 years ago
- [EMNLP 2022] TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data☆17May 17, 2023Updated 3 years ago
- SOTA Math Opensource LLM☆336Dec 12, 2023Updated 2 years ago