mim-solutions / bert_for_longer_texts
BERT classification model for processing texts longer than 512 tokens. The text is first divided into smaller chunks, each chunk is fed to BERT, and the intermediate results are pooled. The implementation allows fine-tuning.
☆147 Updated last year
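The chunk-and-pool idea in the description can be illustrated with a minimal sketch. This is not the repository's code: the names `split_into_chunks`, `pool_scores`, and `classify_long_text` are hypothetical, the defaults (510 tokens per chunk, leaving room for BERT's `[CLS]` and `[SEP]` tokens, and mean pooling) are assumptions, and `score_chunk` is a stand-in for whatever model call returns a per-chunk probability.

```python
from typing import Callable, List, Sequence


def split_into_chunks(
    token_ids: Sequence[int], chunk_size: int = 510, stride: int = 510
) -> List[List[int]]:
    """Split token ids into windows of at most chunk_size tokens.

    A stride smaller than chunk_size produces overlapping windows.
    Both defaults are illustrative assumptions.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks: List[List[int]] = []
    for start in range(0, len(token_ids), stride):
        chunks.append(list(token_ids[start:start + chunk_size]))
        if start + chunk_size >= len(token_ids):
            break  # the last window already reaches the end of the text
    return chunks or [[]]


def pool_scores(scores: Sequence[float], strategy: str = "mean") -> float:
    """Combine per-chunk scores into one document-level score."""
    if not scores:
        raise ValueError("no scores to pool")
    if strategy == "mean":
        return sum(scores) / len(scores)
    if strategy == "max":
        return max(scores)
    raise ValueError(f"unknown pooling strategy: {strategy}")


def classify_long_text(
    token_ids: Sequence[int],
    score_chunk: Callable[[List[int]], float],
    chunk_size: int = 510,
    stride: int = 510,
    strategy: str = "mean",
) -> float:
    """Score every chunk with score_chunk (hypothetical model call), then pool."""
    chunks = split_into_chunks(token_ids, chunk_size, stride)
    return pool_scores([score_chunk(chunk) for chunk in chunks], strategy)


if __name__ == "__main__":
    ids = list(range(1200))  # stand-in for a tokenized document
    # Toy scorer: a chunk is "positive" only if it contains token id 7.
    toy_scorer = lambda chunk: 1.0 if 7 in chunk else 0.0
    print(len(split_into_chunks(ids)))
    print(classify_long_text(ids, toy_scorer, strategy="max"))
```

In a real setup `score_chunk` would wrap a fine-tuned BERT classifier; mean pooling dilutes a signal that appears in only one chunk, while max pooling preserves it, so the choice of strategy depends on the task.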
Alternatives and similar repositories for bert_for_longer_texts
Users who are interested in bert_for_longer_texts are comparing it to the libraries listed below.
- ☆372 Updated 2 years ago
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆265 Updated last year
- Efficient Attention for Long Sequence Processing ☆98 Updated 2 years ago
- OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track) ☆793 Updated this week
- Clustering sentence embeddings to extract message intent ☆174 Updated 4 years ago
- TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/und… ☆371 Updated 9 months ago
- Comparing the Performance of LLMs: A Deep Dive into RoBERTa, Llama, and Mistral for Disaster Tweets Analysis with LoRA ☆51 Updated 2 years ago
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ☆153 Updated last year
- Zero and Few shot named entity & relationships recognition ☆398 Updated 3 months ago
- Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure* ☆83 Updated 2 years ago
- Guideline following Large Language Model for Information Extraction ☆421 Updated last year
- ☆179 Updated last year
- PyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with pyTorch … ☆82 Updated 3 years ago
- [ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction ☆311 Updated 3 years ago
- SpanMarker for Named Entity Recognition ☆462 Updated 11 months ago
- A collection of topic diversity measures for topic modeling ☆48 Updated 4 years ago
- Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer… ☆396 Updated 2 years ago
- Multimodal model for text and tabular data with HuggingFace transformers as building block for text data ☆614 Updated last year
- Creating class-based TF-IDF matrices ☆91 Updated 3 years ago
- TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL'24) ☆371 Updated 9 months ago
- Calculate perplexity on a text with pre-trained language models. Support MLM (e.g. DeBERTa), recurrent LM (e.g. GPT3), and encoder-decoder … ☆165 Updated 6 months ago
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆87 Updated last year
- PatentSBERTa: A Deep NLP based Hybrid Model for Patent Distance and Classification using Augmented SBERT ☆103 Updated last year
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆337 Updated 2 years ago
- Fine-tuning of Flan-T5 LLM for text classification 🤖 focuses on adapting a state-of-the-art language model to enhance its ability to cla… ☆44 Updated last year
- A package to run embedded topic modelling with ETM. Adapted from the original at: https://github.com/adjidieng/ETM ☆96 Updated 2 years ago
- HDBSCAN Tuning for BERTopic Models ☆49 Updated 2 years ago
- [DEPRECATED] Adapt Transformer-based language models to new text domains ☆86 Updated last year
- A repo to explore different NLP tasks which can be solved using T5 ☆173 Updated 4 years ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆212 Updated 3 months ago