Exploration-Lab / HLDC
☆14Updated 4 months ago
Alternatives and similar repositories for HLDC
Users who are interested in HLDC are comparing it to the repositories listed below
- OpenNyAI is a mission aimed at developing open source software and datasets to catalyze the creation of AI-powered solutions to improve a…☆40Updated last year
- ☆92Updated 3 months ago
- Code Repository for the IndicXNLI paper.☆15Updated last year
- Pretraining, fine-tuning and evaluation scripts for IndicBERT-v2 and IndicXTREME☆96Updated 2 months ago
- This repository contains the HiNER dataset released with our paper at LREC 2022☆15Updated 2 years ago
- Yet Another Neural Machine Translation Toolkit☆178Updated 2 months ago
- This repository is dedicated to development of code-mixed language resources.☆25Updated last year
- A pipeline for transliteration, spell correction, POS tagging and word sense disambiguation of Hinglish code mixed data to Hindi Devanaga…☆36Updated last year
- A benchmark for code-switched NLP, ACL 2020☆75Updated last year
- indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2☆127Updated last year
- Efficient Attention for Long Sequence Processing☆94Updated last year
- Describes the IndicNLP corpus and associated datasets☆172Updated 2 years ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021.☆21Updated last month
- Pre-trained, multilingual sequence-to-sequence models for Indian languages☆47Updated 2 years ago
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English☆205Updated last year
- An assignment for CMU CS11-711 Advanced NLP, building NLP systems from scratch☆171Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.☆93Updated 2 years ago
- SemEval 2024 Task 1 : Textual Semantic Relatedness☆25Updated 11 months ago
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.☆108Updated 2 months ago
- IndicGenBench is a high-quality, multilingual, multi-way parallel benchmark for evaluating Large Language Models (LLMs) on 4 user-facing …☆50Updated 9 months ago
- Find informative examples to efficiently (human)-evaluate NLG models.☆11Updated last week
- Resources for cultural NLP research☆95Updated last month
- Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM"☆59Updated 7 months ago
- Hinglish Text Classification☆30Updated last year
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish.☆59Updated 2 years ago
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023☆69Updated last year
- An instruction-based benchmark for text improvements.☆141Updated 2 years ago
- A repo to explore different NLP tasks which can be solved using T5☆172Updated 4 years ago
- This repository contains the code for "Generating Datasets with Pretrained Language Models".☆188Updated 3 years ago
- 🔍 A statutory article retrieval dataset in French. (ACL 2022)☆40Updated last year