mim-solutions / bert_for_longer_texts
BERT classification model for processing texts longer than 512 tokens. The text is first split into smaller chunks, each chunk is fed to BERT, and the intermediate results are pooled. The implementation allows fine-tuning.
☆142 Updated last year
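The chunk-then-pool idea in the description can be sketched roughly as follows. This is a minimal illustration, not the repository's actual API: the names `split_into_chunks` and `classify_long_text`, the `score_chunk` callable (a stand-in for a BERT forward pass that returns one score per chunk), and the default window of 510 tokens with a stride of 255 are all assumptions made for the example. The value 510 assumes two positions are reserved for the `[CLS]` and `[SEP]` special tokens.

```python
from statistics import mean
from typing import Callable, List, Sequence


def split_into_chunks(
    token_ids: Sequence[int], chunk_size: int = 510, stride: int = 255
) -> List[List[int]]:
    """Split a token sequence into overlapping windows of at most chunk_size tokens."""
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + chunk_size]))
        # Stop once the current window reaches the end of the sequence.
        if start + chunk_size >= len(token_ids):
            break
        start += stride
    return chunks


def classify_long_text(
    token_ids: Sequence[int],
    score_chunk: Callable[[List[int]], float],
    pooling: str = "mean",
    chunk_size: int = 510,
    stride: int = 255,
) -> float:
    """Score every chunk with score_chunk (hypothetical model call) and pool the results."""
    scores = [score_chunk(c) for c in split_into_chunks(token_ids, chunk_size, stride)]
    if pooling == "mean":
        return mean(scores)
    if pooling == "max":
        return max(scores)
    raise ValueError(f"unknown pooling strategy: {pooling}")
```

In a real setup, `score_chunk` would wrap a fine-tuned BERT classifier and return, for example, the positive-class probability for one chunk. Mean pooling treats all chunks equally, while max pooling lets a single strongly scored chunk decide the label; which one works better likely depends on the task.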
Alternatives and similar repositories for bert_for_longer_texts
Users who are interested in bert_for_longer_texts often compare it to the libraries listed below.
- ☆367 Updated last year
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆263 Updated 8 months ago
- Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure* ☆77 Updated last year
- TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL'24) ☆320 Updated 4 months ago
- Clustering sentence embeddings to extract message intent ☆174 Updated 3 years ago
- Efficient Attention for Long Sequence Processing ☆95 Updated last year
- Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer… ☆389 Updated 2 years ago
- Zero and Few shot named entity & relationships recognition ☆381 Updated 2 months ago
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ☆155 Updated last year
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆336 Updated 2 years ago
- Guideline following Large Language Model for Information Extraction ☆387 Updated 8 months ago
- Comparing the Performance of LLMs: A Deep Dive into Roberta, Llama, and Mistral for Disaster Tweets Analysis with Lora ☆51 Updated last year
- Text classification with Foundation Language Model LLaMA ☆114 Updated 2 years ago
- OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track) ☆772 Updated 11 months ago
- SpanMarker for Named Entity Recognition ☆437 Updated 6 months ago
- Creating class-based TF-IDF matrices ☆86 Updated 2 years ago
- Active Learning for Text Classification in Python ☆618 Updated 3 weeks ago
- Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with L… ☆38 Updated last year
- Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document … ☆186 Updated last year
- A Framework for Textual Entailment based Zero Shot text classification ☆152 Updated last year
- A Topic Modeling System Toolkit (ACL 2024 Demo) ☆259 Updated 3 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆208 Updated 2 months ago
- TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/und… ☆351 Updated 3 months ago
- simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models. ☆395 Updated 2 years ago
- [ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction ☆308 Updated 2 years ago
- ☆66 Updated 3 years ago
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13 ☆182 Updated this week
- A collection of topic diversity measures for topic modeling ☆47 Updated 3 years ago
- Long Document Summarization Papers ☆148 Updated last year
- pyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with pyTorch … ☆82 Updated 2 years ago