mim-solutions / bert_for_longer_texts
BERT classification model for processing texts longer than 512 tokens. The text is first divided into smaller chunks, each chunk is fed to BERT, and the intermediate results are pooled. The implementation allows fine-tuning.
☆144 Updated last year
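The chunk-then-pool idea in the description above can be sketched in a few lines. This is only an illustration under stated assumptions, not the repository's actual code: the names `split_into_chunks`, `pool_scores`, and `classify_long_text`, the chunk size of 510 (512 minus the two special tokens BERT adds), the stride of 255, and the stand-in scoring function are all hypothetical. A real setup would replace `score_chunk` with a call to a BERT classifier.

```python
from statistics import mean
from typing import Callable, List, Sequence


def split_into_chunks(
    token_ids: Sequence[int], chunk_size: int = 510, stride: int = 255
) -> List[List[int]]:
    """Split token ids into overlapping windows of at most chunk_size tokens."""
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + chunk_size]))
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(token_ids):
            break
        start += stride
    return chunks


def pool_scores(scores: Sequence[float], strategy: str = "mean") -> float:
    """Combine per-chunk scores into a single score for the whole text."""
    if not scores:
        raise ValueError("scores must not be empty")
    if strategy == "mean":
        return mean(scores)
    if strategy == "max":
        return max(scores)
    raise ValueError(f"unknown pooling strategy: {strategy}")


def classify_long_text(
    token_ids: Sequence[int],
    score_chunk: Callable[[List[int]], float],
    chunk_size: int = 510,
    stride: int = 255,
    strategy: str = "mean",
) -> float:
    """Score every chunk with score_chunk (a stand-in for BERT), then pool."""
    chunks = split_into_chunks(token_ids, chunk_size, stride)
    return pool_scores([score_chunk(chunk) for chunk in chunks], strategy)


if __name__ == "__main__":
    # Hypothetical stand-in for a model: the score is the fraction of even ids.
    def fake_model(chunk: List[int]) -> float:
        return sum(1 for t in chunk if t % 2 == 0) / max(len(chunk), 1)

    long_text = list(range(1000))  # pretend these are 1000 token ids
    print(classify_long_text(long_text, fake_model, strategy="mean"))
```

Overlapping windows (a stride smaller than the chunk size) mean that a passage cut at one chunk boundary still appears whole in the neighbouring chunk. Whether mean or max pooling works better depends on the task, so the sketch leaves it as a parameter.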
Alternatives and similar repositories for bert_for_longer_texts
Users who are interested in bert_for_longer_texts are comparing it to the libraries listed below
- ☆369 Updated last year
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆264 Updated 11 months ago
- Clustering sentence embeddings to extract message intent ☆175 Updated 3 years ago
- Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer… ☆395 Updated 2 years ago
- Efficient Attention for Long Sequence Processing ☆97 Updated last year
- Zero and Few shot named entity & relationships recognition ☆388 Updated 3 weeks ago
- Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with L… ☆37 Updated 2 years ago
- OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track) ☆787 Updated last year
- Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure* ☆82 Updated last year
- TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL'24) ☆350 Updated 6 months ago
- SpanMarker for Named Entity Recognition ☆453 Updated 9 months ago
- Guideline following Large Language Model for Information Extraction ☆402 Updated 11 months ago
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ☆155 Updated last year
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆337 Updated 2 years ago
- Creating class-based TF-IDF matrices ☆89 Updated 2 years ago
- A collection of topic diversity measures for topic modeling ☆47 Updated 4 years ago
- TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/und… ☆359 Updated 6 months ago
- Repository for TweetEval ☆386 Updated 3 years ago
- Multilingual/multidomain question generation datasets, models, and python library for question generation. ☆364 Updated last year
- Active Learning for Text Classification in Python ☆628 Updated last month
- [ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction ☆311 Updated 2 years ago
- ☆68 Updated 4 years ago
- A repo to explore different NLP tasks which can be solved using T5 ☆172 Updated 4 years ago
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13 ☆192 Updated last month
- A Framework for Textual Entailment based Zero Shot text classification ☆152 Updated last year
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆211 Updated 3 weeks ago
- Aligned Neural Topic Model (ANTM) for Exploring Evolving Topics: a dynamic neural topic model that uses document embeddings (data2vec) to… ☆37 Updated last year
- Fine-tuning of Flan-T5 LLM for text classification 🤖 focuses on adapting a state-of-the-art language model to enhance its ability to cla… ☆43 Updated 11 months ago
- A package to run embedded topic modelling with ETM. Adapted from the original at: https://github.com/adjidieng/ETM ☆96 Updated 2 years ago
- KeyPhraseTransformer lets you quickly extract key phrases, topics, themes from your text data with T5 transformer | Keyphrase extraction… ☆105 Updated last year