mim-solutions / bert_for_longer_texts
BERT classification model for processing texts longer than 512 tokens. The text is first split into smaller chunks, each chunk is fed to BERT, and the intermediate results are then pooled. The implementation supports fine-tuning.
☆135 Updated 8 months ago
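The chunk-then-pool idea in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's actual API: the names `split_into_chunks`, `pool_scores`, `classify_long_text`, and the parameters `chunk_size`, `stride`, and `strategy` are hypothetical, and `score_chunk` stands in for a BERT forward pass that returns one score per chunk. The default of 510 assumes a 512-token limit with two positions reserved for the `[CLS]` and `[SEP]` special tokens.

```python
from typing import Callable, List, Sequence


def split_into_chunks(
    token_ids: Sequence[int], chunk_size: int = 510, stride: int = 510
) -> List[List[int]]:
    """Split token ids into windows of at most `chunk_size` tokens.

    A new window starts every `stride` tokens, so stride < chunk_size
    gives overlapping windows. Both names are hypothetical parameters.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    if stride > chunk_size:
        raise ValueError("stride > chunk_size would skip tokens")
    chunks: List[List[int]] = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + chunk_size]))
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(token_ids):
            break
        start += stride
    return chunks


def pool_scores(chunk_scores: Sequence[float], strategy: str = "mean") -> float:
    """Combine per-chunk scores into a single document-level score."""
    if not chunk_scores:
        raise ValueError("need at least one chunk score")
    if strategy == "mean":
        return sum(chunk_scores) / len(chunk_scores)
    if strategy == "max":
        return max(chunk_scores)
    raise ValueError(f"unknown pooling strategy: {strategy}")


def classify_long_text(
    token_ids: Sequence[int],
    score_chunk: Callable[[List[int]], float],
    chunk_size: int = 510,
    stride: int = 510,
    strategy: str = "mean",
) -> float:
    """Score every chunk with `score_chunk` (a stand-in for BERT) and pool."""
    chunks = split_into_chunks(token_ids, chunk_size, stride)
    return pool_scores([score_chunk(chunk) for chunk in chunks], strategy)


if __name__ == "__main__":
    # Toy scorer used only for illustration: 1.0 if token id 7 is present.
    def toy_scorer(chunk: List[int]) -> float:
        return 1.0 if 7 in chunk else 0.0

    ids = list(range(1200))
    print(len(split_into_chunks(ids)))
    print(classify_long_text(ids, toy_scorer, strategy="max"))
```

In a real setup, `score_chunk` would wrap the model call for a single chunk. Mean pooling averages the evidence across the whole document, while max pooling lets a single strongly scored chunk decide the result.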
Alternatives and similar repositories for bert_for_longer_texts:
Users who are interested in bert_for_longer_texts are comparing it to the libraries listed below:
- Efficient Attention for Long Sequence Processing ☆92 Updated last year
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆257 Updated 3 months ago
- Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure* ☆73 Updated last year
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆328 Updated last year
- ☆349 Updated last year
- Creating class-based TF-IDF matrices ☆82 Updated 2 years ago
- ☆61 Updated 4 years ago
- ☆21 Updated 8 months ago
- Clustering sentence embeddings to extract message intent ☆170 Updated 3 years ago
- ☆155 Updated 8 months ago
- Text classification with Foundation Language Model LLaMA ☆114 Updated last year
- Applying BERT to named entity recognition in English and Russian. ☆162 Updated 2 years ago
- Neural information retrieval / Semantic search / Bi-encoders ☆169 Updated last year
- TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL'24) ☆248 Updated 3 weeks ago
- Named Entity Recognition in PyTorch on CoNLL2003 dataset ☆16 Updated 3 years ago
- Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer… ☆383 Updated last year
- A simple recipe for training and inferencing Transformer architecture for Multi-Task Learning on custom datasets. You can find two approa… ☆94 Updated 2 years ago
- Contains notebooks related to various transformers based models for different nlp based tasks ☆42 Updated last year
- Zero and Few shot named entity & relationships recognition ☆358 Updated 2 months ago
- ☆42 Updated last year
- A repo to explore different NLP tasks which can be solved using T5 ☆172 Updated 4 years ago
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆80 Updated 9 months ago
- Active Learning for Text Classification in Python ☆605 Updated 3 weeks ago
- Guideline following Large Language Model for Information Extraction ☆343 Updated 3 months ago
- Comparing the Performance of LLMs: A Deep Dive into Roberta, Llama, and Mistral for Disaster Tweets Analysis with Lora ☆50 Updated last year
- Building NER and RE components using HuggingFace Transformers ☆50 Updated 2 years ago
- Define Transformers, T5 model and RoBERTa Encoder decoder model for product names generation ☆48 Updated 3 years ago
- Long Document Summarization Papers ☆141 Updated last year
- ☆44 Updated 2 years ago
- A collection of topic diversity measures for topic modeling ☆45 Updated 3 years ago