mim-solutions / bert_for_longer_texts
BERT classification model for processing texts longer than 512 tokens. The text is first split into smaller chunks, each chunk is fed to BERT, and the intermediate results are pooled. The implementation allows fine-tuning.
☆132 Updated 7 months ago
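The chunk-then-pool idea in the description can be sketched roughly as follows. This is a minimal illustration, not the repository's actual API: the function names, the `chunk_size` and `stride` parameters, and the `score_chunk` callable (a stand-in for running BERT on one chunk and returning a score) are all hypothetical. The default of 510 assumes BERT's 512-token limit minus the two special tokens `[CLS]` and `[SEP]`.

```python
from statistics import mean
from typing import Callable, List, Sequence


def split_into_chunks(
    token_ids: Sequence[int], chunk_size: int = 510, stride: int = 255
) -> List[List[int]]:
    """Split token ids into windows of at most `chunk_size` tokens.

    A new window starts every `stride` tokens, so windows overlap
    whenever stride < chunk_size. Both parameters are assumptions.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + chunk_size]))
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(token_ids):
            break
        start += stride
    return chunks


def pool_scores(scores: Sequence[float], strategy: str = "mean") -> float:
    """Combine per-chunk scores into a single score for the whole text."""
    if not scores:
        raise ValueError("no scores to pool")
    if strategy == "mean":
        return mean(scores)
    if strategy == "max":
        return max(scores)
    raise ValueError(f"unknown pooling strategy: {strategy}")


def classify_long_text(
    token_ids: Sequence[int],
    score_chunk: Callable[[List[int]], float],  # hypothetical stand-in for BERT
    chunk_size: int = 510,
    stride: int = 255,
    strategy: str = "mean",
) -> float:
    """Score each chunk independently, then pool the chunk scores."""
    chunks = split_into_chunks(token_ids, chunk_size, stride)
    return pool_scores([score_chunk(chunk) for chunk in chunks], strategy)
```

In a real setup `score_chunk` would wrap a fine-tuned classifier; whether mean or max pooling works better is an empirical question that depends on the task.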
Alternatives and similar repositories for bert_for_longer_texts
Users who are interested in bert_for_longer_texts are comparing it to the libraries listed below:
- Clustering sentence embeddings to extract message intent ☆169 Updated 3 years ago
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆255 Updated 2 months ago
- ☆349 Updated last year
- Creating class-based TF-IDF matrices ☆82 Updated 2 years ago
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ☆149 Updated 10 months ago
- Efficient Attention for Long Sequence Processing ☆91 Updated last year
- HDBSCAN Tuning for BERTopic Models ☆42 Updated last year
- A repo to explore different NLP tasks which can be solved using T5 ☆170 Updated 3 years ago
- Comparing the Performance of LLMs: A Deep Dive into RoBERTa, Llama, and Mistral for Disaster Tweets Analysis with LoRA ☆50 Updated last year
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆327 Updated last year
- Applying BERT to named entity recognition in English and Russian. ☆162 Updated 2 years ago
- Text classification with Foundation Language Model LLaMA ☆113 Updated last year
- ☆61 Updated 3 years ago
- ☆61 Updated 3 years ago
- Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure* ☆72 Updated last year
- A package to run embedded topic modelling with ETM. Adapted from the original at: https://github.com/adjidieng/ETM ☆96 Updated last year
- Guideline following Large Language Model for Information Extraction ☆330 Updated 2 months ago
- A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ☆1,211 Updated last year
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13 ☆166 Updated 2 months ago
- OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at EACL2021 demo track) ☆744 Updated 5 months ago
- Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer… ☆381 Updated last year
- [ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction ☆305 Updated 2 years ago
- PyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with PyTorch … ☆80 Updated 2 years ago
- ☆153 Updated 7 months ago
- ☆42 Updated last year
- A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures. ☆73 Updated last month
- Long Document Summarization Papers ☆140 Updated last year
- Aligned Neural Topic Model (ANTM) for Exploring Evolving Topics: a dynamic neural topic model that uses document embeddings (data2vec) to… ☆34 Updated last year
- A Python library for calculating a large variety of metrics from text ☆320 Updated last month
- A Framework for Textual Entailment based Zero Shot text classification ☆154 Updated 10 months ago