mim-solutions / bert_for_longer_texts
BERT classification model for processing texts longer than 512 tokens. The text is first split into smaller chunks, each chunk is fed to BERT, and the intermediate results are then pooled. The implementation supports fine-tuning.
☆144 · Updated last year
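The chunk-and-pool idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's actual code or API: the names `split_into_chunks`, `classify_long_text`, and `score_chunk`, and the `chunk_size`, `stride`, and `pooling` parameters are all hypothetical. The default of 510 assumes that two of BERT's 512 positions are reserved for the `[CLS]` and `[SEP]` tokens, and the per-chunk scorer is a stand-in for a real BERT forward pass.

```python
from statistics import mean
from typing import Callable, List, Sequence


def split_into_chunks(
    token_ids: Sequence[int], chunk_size: int = 510, stride: int = 510
) -> List[List[int]]:
    """Split token ids into windows of at most `chunk_size` tokens.

    A `stride` smaller than `chunk_size` makes consecutive windows overlap.
    (Hypothetical helper; parameter names are assumptions.)
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks: List[List[int]] = []
    for start in range(0, len(token_ids), stride):
        chunks.append(list(token_ids[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(token_ids):
            break
    return chunks


def classify_long_text(
    token_ids: Sequence[int],
    score_chunk: Callable[[List[int]], float],
    chunk_size: int = 510,
    stride: int = 510,
    pooling: str = "mean",
) -> float:
    """Score every chunk separately, then pool the per-chunk scores.

    `score_chunk` stands in for a BERT classifier applied to one chunk.
    """
    chunks = split_into_chunks(token_ids, chunk_size, stride)
    if not chunks:
        raise ValueError("cannot classify an empty text")
    scores = [score_chunk(chunk) for chunk in chunks]
    if pooling == "mean":
        return mean(scores)
    if pooling == "max":
        return max(scores)
    raise ValueError(f"unknown pooling method: {pooling}")


if __name__ == "__main__":
    # Toy scorer (an assumption for illustration): share of even token ids.
    def toy_scorer(chunk: List[int]) -> float:
        return sum(1 for t in chunk if t % 2 == 0) / len(chunk)

    long_text = list(range(1200))  # pretend these are 1200 token ids
    print(classify_long_text(long_text, toy_scorer, pooling="mean"))
```

In a real setup, `score_chunk` would wrap a tokenized chunk in the special tokens, run the model, and return a class probability. Mean pooling averages the evidence across the whole document, while max pooling lets a single strongly scored chunk decide the label.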
Alternatives and similar repositories for bert_for_longer_texts
Users interested in bert_for_longer_texts compare it with the libraries listed below:
- ☆367 · Updated last year
- Efficient Attention for Long Sequence Processing ☆98 · Updated last year
- Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer… ☆391 · Updated 2 years ago
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆265 · Updated 9 months ago
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ☆155 · Updated last year
- Text classification with Foundation Language Model LLaMA ☆114 · Updated 2 years ago
- Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure* ☆81 · Updated last year
- Fine-tuning of Flan-T5 LLM for text classification 🤖 focuses on adapting a state-of-the-art language model to enhance its ability to cla… ☆44 · Updated 10 months ago
- Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with L… ☆38 · Updated last year
- OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at EACL2021 demo track) ☆782 · Updated last year
- Calculate perplexity on a text with pre-trained language models. Supports MLM (e.g. DeBERTa), recurrent LM (e.g. GPT3), and encoder-decoder … ☆162 · Updated 2 months ago
- Comparing the Performance of LLMs: A Deep Dive into RoBERTa, Llama, and Mistral for Disaster Tweets Analysis with LoRA ☆51 · Updated last year
- Zero- and few-shot named entity and relationship recognition ☆385 · Updated 3 months ago
- PyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with PyTorch … ☆82 · Updated 2 years ago
- Active Learning for Text Classification in Python ☆621 · Updated this week
- Clustering sentence embeddings to extract message intent ☆175 · Updated 3 years ago
- Guideline following Large Language Model for Information Extraction ☆392 · Updated 10 months ago
- [ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction ☆309 · Updated 2 years ago
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆337 · Updated 2 years ago
- ☆60 · Updated 4 years ago
- Multimodal model for text and tabular data with HuggingFace transformers as building block for text data ☆605 · Updated 10 months ago
- potato: portable text annotation tool ☆349 · Updated last month
- SpanMarker for Named Entity Recognition ☆451 · Updated 7 months ago
- A curated list of resources on document similarity measures (papers, tutorials, code, ...) ☆252 · Updated 3 years ago
- [ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links ☆443 · Updated 3 years ago
- Long Document Summarization Papers ☆149 · Updated 2 years ago
- A simple recipe for training a Transformer architecture and running inference with it for Multi-Task Learning on custom datasets. You can find two approa… ☆96 · Updated 3 years ago
- ☆67 · Updated 4 years ago
- Creating class-based TF-IDF matrices ☆90 · Updated 2 years ago
- Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive… ☆438 · Updated 3 months ago