mim-solutions / bert_for_longer_texts
BERT classification model for processing texts longer than 512 tokens. The text is first split into smaller chunks, each chunk is fed to BERT, and the intermediate results are pooled. The implementation supports fine-tuning.
☆141Updated 10 months ago
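The chunk-then-pool idea in the description can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names, the 510-token window (512 minus the two special tokens BERT adds), the stride parameter, and the mean/max pooling options are all assumptions made for the example, and `score_chunk` is a hypothetical stand-in for a BERT forward pass that returns one probability per chunk.

```python
from typing import Callable, List, Sequence


def split_into_chunks(
    token_ids: Sequence[int], chunk_size: int = 510, stride: int = 510
) -> List[List[int]]:
    """Split token ids into windows of at most chunk_size tokens.

    A stride smaller than chunk_size makes consecutive windows overlap.
    An empty input yields a single empty chunk.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks: List[List[int]] = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + chunk_size]))
        if start + chunk_size >= len(token_ids):
            break  # the current window already reaches the end of the text
        start += stride
    return chunks


def pool_scores(chunk_scores: Sequence[float], strategy: str = "mean") -> float:
    """Combine per-chunk scores into one document-level score."""
    if not chunk_scores:
        raise ValueError("chunk_scores must not be empty")
    if strategy == "mean":
        return sum(chunk_scores) / len(chunk_scores)
    if strategy == "max":
        return max(chunk_scores)
    raise ValueError(f"unknown pooling strategy: {strategy}")


def classify_long_text(
    token_ids: Sequence[int],
    score_chunk: Callable[[List[int]], float],  # hypothetical model call
    chunk_size: int = 510,
    stride: int = 510,
    strategy: str = "mean",
) -> float:
    """Score every chunk independently, then pool the results."""
    chunks = split_into_chunks(token_ids, chunk_size, stride)
    return pool_scores([score_chunk(chunk) for chunk in chunks], strategy)
```

In a real setup, `score_chunk` would wrap the tokenized chunk with the special tokens, run the model, and return the predicted probability; the pooling choice (mean versus max) is a design decision that depends on whether a single strongly positive chunk should decide the label for the whole document.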
Alternatives and similar repositories for bert_for_longer_texts:
Users who are interested in bert_for_longer_texts are comparing it to the libraries listed below:
- Efficient Attention for Long Sequence Processing☆93Updated last year
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum…☆262Updated 5 months ago
- Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure*☆75Updated last year
- Clustering sentence embeddings to extract message intent☆173Updated 3 years ago
- Comparing the Performance of LLMs: A Deep Dive into Roberta, Llama, and Mistral for Disaster Tweets Analysis with Lora☆51Updated last year
- ☆360Updated last year
- ☆44Updated 2 years ago
- Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with L…☆35Updated last year
- ☆158Updated 10 months ago
- [ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction☆306Updated 2 years ago
- Coreference Resolution☆76Updated 4 years ago
- A repo to explore different NLP tasks which can be solved using T5☆172Updated 4 years ago
- A text truncation method, useful for instance in long text classification☆23Updated 2 years ago
- A curated list of resources on document similarity measures (papers, tutorials, code, ...)☆246Updated 2 years ago
- HDBSCAN Tuning for BERTopic Models☆45Updated last year
- SpanMarker for Named Entity Recognition☆425Updated 3 months ago
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning☆154Updated last year
- Text classification with Foundation Language Model LLaMA☆115Updated 2 years ago
- pyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with pyTorch …☆81Updated 2 years ago
- Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer…☆387Updated last year
- Guideline following Large Language Model for Information Extraction☆365Updated 5 months ago
- A package to run embedded topic modelling with ETM. Adapted from the original at: https://github.com/adjidieng/ETM☆95Updated last year
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: …☆333Updated last year
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER)☆82Updated 11 months ago
- ☆41Updated 3 years ago
- ☆61Updated 4 years ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes☆197Updated 6 months ago
- Applying BERT to named entity recognition in English and Russian.☆162Updated 2 years ago
- A Framework for Textual Entailment based Zero Shot text classification☆152Updated last year
- A collection of topic diversity measures for topic modeling☆45Updated 3 years ago