Mithileysh / Email-Datasets
Email Datasets can be found here
☆77 Updated 3 weeks ago
Alternatives and similar repositories for Email-Datasets
Users who are interested in Email-Datasets are comparing it to the repositories listed below.
- A dataset for pretraining language models targeted for legal tasks. ☆140 Updated 3 years ago
- ☆55 Updated 2 years ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆111 Updated last year
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… ☆95 Updated 2 years ago
- ☆84 Updated 2 years ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆80 Updated 2 years ago
- GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction ☆83 Updated last year
- Tools to construct and process Common Crawl webgraphs ☆104 Updated 3 weeks ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆32 Updated 4 years ago
- multimodal document analysis ☆166 Updated 2 months ago
- Asent is a python library for performing efficient and transparent sentiment analysis using spaCy. ☆120 Updated 2 months ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆67 Updated 2 years ago
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆40 Updated 4 years ago
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training ☆45 Updated 5 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year
- Logical structure analysis for visually structured documents ☆94 Updated 3 years ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ☆17 Updated last year
- Accurate word segmentation for hashtags and text, powered by Transformers and Beam Search. A scalable alternative to heuristic splitters … ☆76 Updated last week
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated… ☆26 Updated 3 years ago
- Various Jupyter notebooks about Common Crawl data ☆61 Updated last month
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ☆233 Updated 5 months ago
- ☆52 Updated 5 months ago
- Code for SaGe subword tokenizer (EACL 2023) ☆27 Updated last year
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 Updated 2 years ago
- A PyTorch-based open-source framework that provides methods for improving the weakly annotated data and allows researchers to efficiently… ☆108 Updated last year
- 💫 SpaCy wrapper for ConceptNet 💫 ☆95 Updated 2 weeks ago
- The AI Knowledge Editor ☆184 Updated 3 years ago
- Simply, faster, sentence-transformers ☆143 Updated last year
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆133 Updated last year
- A Python library aimed at dissecting and augmenting NER training data. ☆60 Updated 2 years ago