[ACL'24] MC^2: A Multilingual Corpus of Minority Languages in China (Tibetan, Uyghur, Kazakh, and Mongolian)
☆31Jan 17, 2026Updated 2 months ago
Alternatives and similar repositories for mc2_corpus
Users that are interested in mc2_corpus are comparing it to the libraries listed below.
- [ACL'24 Findings] Teaching Large Language Models an Unseen Language on the Fly☆25Jan 6, 2026Updated 3 months ago
- ☆11May 28, 2024Updated last year
- A highlight tool for reading ArXiv papers☆31May 30, 2021Updated 4 years ago
- A Multi-tasking and Multi-stage Chinese Minority Pre-Trained Language Model☆12Jul 24, 2023Updated 2 years ago
- Source code and data for Counterfactual Recipe Generation: Exploring Models’ Compositional Generalization Ability in a Realistic Scenario…☆15Oct 25, 2022Updated 3 years ago
- ☆17May 17, 2022Updated 3 years ago
- 🈵 Collected resources to learn/study Manchu (Manchurian Language). An introduction to the Manchu language.☆18Jun 7, 2023Updated 2 years ago
- Source code and data for The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code (Findings of ACL 2023…☆31Jun 4, 2023Updated 2 years ago
- Code for Paper "Target-oriented Fine-tuning for Zero-Resource Named Entity Recognition"☆20Sep 28, 2022Updated 3 years ago
- A curated list of papers on LLMs and agents for scientific research and development☆87Dec 11, 2024Updated last year
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data☆47Feb 18, 2025Updated last year
- Inference Code for Paper "Harder Tasks Need More Experts: Dynamic Routing in MoE Models"☆70Jul 30, 2024Updated last year
- MergeNet-filter-ldr2hdr; for details, see the paper "Reconstructing HDR Image from a Single Filtered LDR Image Base on a Deep HDR Merger Network"☆10Sep 11, 2019Updated 6 years ago
- ROCK Framework for Commonsense Causality Reasoning (CCR)☆10Jun 28, 2023Updated 2 years ago
- ☆23Oct 14, 2024Updated last year
- [Machine Learning 2023] NaCL: Noise-Robust Cross-Domain Contrastive Learning for Unsupervised Domain Adaptation☆12Jul 8, 2023Updated 2 years ago
- Implementation of the DPD architecture and related experiments for the ACL 2024 paper "Semisupervised Neural Proto-Language Reconstructio…☆11Jul 22, 2024Updated last year
- Research into the Manchu hypothesis of the Voynich manuscript. It is an Oracle database with tables, DML scripts, PL/SQL functions, and queries.☆16Jun 11, 2014Updated 11 years ago
- EmotionCircuits-LLM: A complete, reproducible framework for discovering and controlling emotion circuits in large language models.☆47Apr 7, 2026Updated last week
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆21Feb 27, 2025Updated last year
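- Inspired by self-instruct: beyond general-purpose LLMs, small domain-specific LLMs can be built for customized results, using questions and answers obtained from GPT as training data☆18May 12, 2023Updated 2 years ago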
- ☆10Mar 22, 2024Updated 2 years ago
- Repository for ACL2021 paper: <Zero-shot Event Extraction via Transfer Learning: Challenges and Insights>.☆30Jan 5, 2023Updated 3 years ago
- URIEL+ knowledge base for natural language processing☆17Dec 16, 2025Updated 4 months ago
- A Large-Scale Open-Domain Tabular Question Answering Dataset for the Real Estate Sector☆15Jun 26, 2025Updated 9 months ago
- [LLM & NLP & Algorithms Bundle] A large collection of premium NLP, LLM, and algorithm material in one package, for study, research, and work.☆30Sep 18, 2024Updated last year
- Generate a 1 million-sample warm-up dataset for neural machine translation from a 700 million-word Mongolian text corpus using the Google…☆18Jun 27, 2025Updated 9 months ago
- Tibetan-English translator for CLI☆16Jan 26, 2026Updated 2 months ago
- The official dataset of paper "Goal-Oriented Prompt Attack and Safety Evaluation for LLMs".☆21Feb 5, 2024Updated 2 years ago
- A Manchu dictionary website☆13Feb 26, 2026Updated last month
- FastNLP implementation of several ABSA subtasks and models; also available at https://gitee.com/ROGERDJQ/FastABSA.git.☆17Mar 18, 2023Updated 3 years ago
- A simple, Python-based, command-line runner for MGIZA++.☆10Mar 24, 2022Updated 4 years ago
- Code for paper: PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, IEEE ICASSP 2024. Demo//124.220.228.133:11107☆21Aug 10, 2024Updated last year
- [EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences"☆20Oct 2, 2024Updated last year
- ☆22Jun 1, 2023Updated 2 years ago
- A tool for extracting plain text and internal Wikipedia links from Wikipedia dumps☆11Apr 18, 2019Updated 6 years ago
- ☆13Nov 19, 2020Updated 5 years ago
- Papers about Opinion Triplet Extraction, including two subtasks: Aspect Sentiment Triplet Extraction (ASTE) and Aspect Sentiment Opinion…☆19Nov 17, 2021Updated 4 years ago
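- Ziya-LLaMA-13B is IDEA's 13-billion-parameter large-scale pre-trained model based on LLaMA, with capabilities in translation, programming, text classification, information extraction, summarization, copywriting, commonsense question answering, and mathematical calculation. The Ziya general-purpose model has completed a three-stage training process: large-scale pre-training, multi-task supervised fine-tuning, and learning from human feedback. This document is mainly for Ziya-…☆46Jun 9, 2023Updated 2 years ago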