Exploration-Lab / HLDC
☆13 Updated 3 months ago
Alternatives and similar repositories for HLDC
Users who are interested in HLDC are comparing it to the repositories listed below.
- OpenNyAI is a mission aimed at developing open source software and datasets to catalyze the creation of AI-powered solutions to improve a… ☆40 Updated last year
- ☆90 Updated 3 months ago
- This repository contains the HiNER dataset released with our paper at LREC 2022 ☆15 Updated last year
- Code Repository for the IndicXNLI paper. ☆15 Updated last year
- A benchmark for code-switched NLP, ACL 2020 ☆74 Updated 11 months ago
- An instruction-based benchmark for text improvements. ☆141 Updated 2 years ago
- Pre-trained, multilingual sequence-to-sequence models for Indian languages ☆47 Updated 2 years ago
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ☆204 Updated last year
- Google's BigBird (Jax/Flax & PyTorch) @ 🤗Transformers ☆49 Updated 2 years ago
- An assignment for CMU CS11-711 Advanced NLP, building NLP systems from scratch ☆172 Updated 2 years ago
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆68 Updated last year
- Yet Another Neural Machine Translation Toolkit ☆178 Updated 2 months ago
- Efficient Attention for Long Sequence Processing ☆94 Updated last year
- 🔍 A statutory article retrieval dataset in French. (ACL 2022) ☆39 Updated last year
- indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2 ☆126 Updated last year
- A pipeline for transliteration, spell correction, POS tagging and word sense disambiguation of Hinglish code mixed data to Hindi Devanaga… ☆36 Updated last year
- Code and Data for Evaluation WG ☆41 Updated 3 years ago
- Resources for cultural NLP research ☆95 Updated 3 weeks ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ☆59 Updated 2 years ago
- This repository contains the code for "Generating Datasets with Pretrained Language Models". ☆188 Updated 3 years ago
- ☆16 Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆93 Updated 2 years ago
- ☆9 Updated 3 years ago
- Define Transformers, T5 model and RoBERTa Encoder decoder model for product names generation ☆48 Updated 3 years ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021. ☆21 Updated last month
- Pretraining, fine-tuning and evaluation scripts for IndicBERT-v2 and IndicXTREME ☆94 Updated last month
- This repository contains materials for the SIGIR 2022 tutorial on opinion summarization. ☆34 Updated 2 years ago
- Long Document Summarization Papers ☆147 Updated last year
- Neural information retrieval / Semantic search / Bi-encoders ☆169 Updated last year
- A reading list of up-to-date papers on NLP for Social Good. ☆301 Updated last year