sagorbrur / codeswitch
CodeSwitch is an NLP tool for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.
☆30 · Updated 3 years ago
Related projects:
- A Benchmark Dataset for Understanding Disfluencies in Question Answering ☆60 · Updated 3 years ago
- Dataset of sentences from Hindi stories tagged with different emotion tags ☆10 · Updated 4 years ago
- The repository for the paper "When Do You Need Billions of Words of Pretraining Data?" ☆20 · Updated 3 years ago
- Statistics on multilingual datasets ☆17 · Updated 2 years ago
- A python tool for building large scale Wikipedia-based Information Retrieval datasets ☆44 · Updated 3 years ago
- A repository containing the details of a natural language inference dataset in Hindi ☆11 · Updated 3 years ago
- BERT models for many languages created from Wikipedia texts ☆34 · Updated 4 years ago
- CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Switching ☆18 · Updated 3 years ago
- diagNNose is a Python library that facilitates a broad set of tools for analysing hidden activations of neural models. ☆81 · Updated 10 months ago
- Training a model without a dataset for natural language inference (NLI) ☆25 · Updated 4 years ago
- Build a dialog dataset from online books in many languages ☆71 · Updated last year
- Tooling to play around with multilingual machine translation for Indian Languages. ☆21 · Updated 2 years ago
- Code for the paper: Saying No is An Art: Contextualized Fallback Responses for Unanswerable Dialogue Queries ☆19 · Updated 2 years ago
- Factorization of the neural parameter space for zero-shot multi-lingual and multi-task transfer ☆39 · Updated 3 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆21 · Updated last year
- Pre-trained, multilingual sequence-to-sequence models for Indian languages ☆43 · Updated 2 years ago
- This code provides a word-level language identification tool for identifying the language of individual words in code-mixed text. e.g. The tex… ☆49 · Updated 4 years ago
- A small repository to test Captum Explainable AI with a trained Flair transformers-based text classifier. ☆25 · Updated 3 years ago
- A simple neural truecaser written in PyTorch and AllenNLP. ☆31 · Updated 3 months ago
- Gamma Agreement in Python ☆43 · Updated 6 months ago
- Assessing syntactic abilities of BERT ☆39 · Updated 5 years ago
- A lightweight but powerful library to build token indices for NLP tasks, compatible with major Deep Learning frameworks like PyTorch and … ☆49 · Updated 3 years ago
- A web interface to understand language-specific BERT-models ☆17 · Updated 5 months ago
- This repository hosts the code for a tokenizer of tweets. ☆12 · Updated 5 years ago
- How Contextual are Contextualized Word Representations? ☆39 · Updated 4 years ago
- Codebase for the Text-based NP Enrichment (TNE) paper ☆19 · Updated 6 months ago