mim-solutions / bert_for_longer_texts
A BERT classification model for processing texts longer than 512 tokens. The text is first split into smaller chunks, each chunk is fed to BERT, and the intermediate results are then pooled. The implementation allows fine-tuning.
☆142 · Updated 11 months ago
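The chunk-then-pool idea described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the repository's actual code: the function names (`split_into_chunks`, `pool_scores`, `classify_long_text`), the default `chunk_size` of 510 (BERT's 512-token limit minus the two special tokens), the `stride` of 255, and the `score_chunk` callback standing in for a BERT forward pass are all assumptions made for the example.

```python
from statistics import mean
from typing import Callable, List, Sequence


def split_into_chunks(
    token_ids: Sequence[int], chunk_size: int = 510, stride: int = 255
) -> List[List[int]]:
    """Split a token sequence into overlapping windows of at most chunk_size tokens."""
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    if len(token_ids) == 0:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(token_ids):
            break
        start += stride
    return chunks


def pool_scores(chunk_scores: Sequence[float], strategy: str = "mean") -> float:
    """Combine per-chunk scores into a single score for the whole text."""
    if not chunk_scores:
        raise ValueError("cannot pool an empty list of scores")
    if strategy == "mean":
        return mean(chunk_scores)
    if strategy == "max":
        return max(chunk_scores)
    raise ValueError(f"unknown pooling strategy: {strategy}")


def classify_long_text(
    token_ids: Sequence[int],
    score_chunk: Callable[[List[int]], float],
    chunk_size: int = 510,
    stride: int = 255,
    strategy: str = "mean",
) -> float:
    """Score every chunk with score_chunk (a hypothetical stand-in for BERT) and pool."""
    chunks = split_into_chunks(token_ids, chunk_size, stride)
    return pool_scores([score_chunk(chunk) for chunk in chunks], strategy)


if __name__ == "__main__":
    # Dummy scorer used only for demonstration; a real setup would run a model here.
    fake_tokens = list(range(1000))
    print(classify_long_text(fake_tokens, lambda chunk: len(chunk) / 510))
```

Whether mean or max pooling works better likely depends on the task: max pooling lets a single strongly scored chunk decide the label, while mean pooling smooths over the whole document.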
Alternatives and similar repositories for bert_for_longer_texts
Users who are interested in bert_for_longer_texts compare it to the libraries listed below.
- Efficient Attention for Long Sequence Processing ☆94 · Updated last year
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆262 · Updated 6 months ago
- Long Document Summarization Papers ☆147 · Updated last year
- Text classification with Foundation Language Model LLaMA ☆115 · Updated 2 years ago
- ☆363 · Updated last year
- Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with L… ☆36 · Updated last year
- Creating class-based TF-IDF matrices ☆84 · Updated 2 years ago
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ☆155 · Updated last year
- A text truncation method, useful for instance in long text classification ☆23 · Updated 2 years ago
- Guideline following Large Language Model for Information Extraction ☆377 · Updated 7 months ago
- Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure* ☆76 · Updated last year
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆83 · Updated last year
- Fine-tuning of Flan-T5 LLM for text classification 🤖 focuses on adapting a state-of-the-art language model to enhance its ability to cla… ☆39 · Updated 7 months ago
- A Python library for calculating a large variety of metrics from text ☆339 · Updated 5 months ago
- A simple recipe for training and inferencing Transformer architecture for Multi-Task Learning on custom datasets. You can find two approa… ☆96 · Updated 2 years ago
- SpanMarker for Named Entity Recognition ☆431 · Updated 4 months ago
- Comparing the Performance of LLMs: A Deep Dive into Roberta, Llama, and Mistral for Disaster Tweets Analysis with Lora ☆51 · Updated last year
- A Framework for Textual Entailment based Zero Shot text classification ☆152 · Updated last year
- A repo to explore different NLP tasks which can be solved using T5 ☆172 · Updated 4 years ago
- PyTorch implementation of Recurrence over BERT (RoBERT), based on the paper at https://arxiv.org/abs/1910.10781, and comparison with PyTorch … ☆82 · Updated 2 years ago
- This repository provides details and links to the ACL anthology corpus/collection including .bib, .pdf and grobid extractions of the pdfs ☆180 · Updated last year
- HDBSCAN Tuning for BERTopic Models ☆47 · Updated 2 years ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆206 · Updated 3 weeks ago
- Clustering sentence embeddings to extract message intent ☆174 · Updated 3 years ago
- Building NER and RE components using HuggingFace Transformers ☆50 · Updated 3 years ago
- SemEval2024-task 11: Bridging the Gap in Text-Based Emotion Detection ☆49 · Updated last month
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆333 · Updated last year
- Calculate perplexity on a text with pre-trained language models. Supports MLM (e.g. DeBERTa), recurrent LM (e.g. GPT3), and encoder-decoder … ☆157 · Updated 8 months ago
- TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL'24) ☆303 · Updated 2 months ago
- ☆161 · Updated 11 months ago