mim-solutions / bert_for_longer_texts

A BERT classification model for texts longer than 512 tokens. The text is first split into smaller chunks, each chunk is fed to BERT, and the per-chunk results are pooled. The implementation supports fine-tuning.
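
The chunk-then-pool idea described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's actual code: the names `split_into_chunks`, `classify_long_text`, `chunk_size`, `stride`, and `pooling` are hypothetical, and `score_chunk` is a stand-in for a real BERT forward pass that returns one score per chunk.

```python
from statistics import mean
from typing import Callable, List, Sequence


def split_into_chunks(token_ids: Sequence[int], chunk_size: int, stride: int) -> List[List[int]]:
    """Split a token sequence into windows of at most chunk_size tokens.

    Consecutive windows start `stride` tokens apart, so a stride smaller
    than chunk_size produces overlapping chunks.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    if len(token_ids) == 0:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + chunk_size]))
        if start + chunk_size >= len(token_ids):
            break  # the last window already reaches the end of the text
        start += stride
    return chunks


def classify_long_text(
    token_ids: Sequence[int],
    score_chunk: Callable[[List[int]], float],
    chunk_size: int = 510,  # assumption: 512 minus the [CLS] and [SEP] tokens
    stride: int = 510,
    pooling: str = "mean",
) -> float:
    """Score every chunk separately, then pool the scores into one value."""
    chunks = split_into_chunks(token_ids, chunk_size, stride)
    if not chunks:
        raise ValueError("cannot classify an empty token sequence")
    scores = [score_chunk(chunk) for chunk in chunks]
    if pooling == "mean":
        return mean(scores)
    if pooling == "max":
        return max(scores)
    raise ValueError(f"unknown pooling strategy: {pooling}")


def toy_score(chunk: List[int]) -> float:
    """Hypothetical stand-in for a model: NOT a real BERT call."""
    return max(chunk) / 10


if __name__ == "__main__":
    tokens = list(range(10))
    print(split_into_chunks(tokens, chunk_size=4, stride=3))
    print(classify_long_text(tokens, toy_score, chunk_size=4, stride=3, pooling="max"))
```

In a real setup, `score_chunk` would wrap a tokenizer-aware model call, and the pooled value would be thresholded or passed through a loss for fine-tuning. Mean pooling treats every chunk equally, while max pooling lets a single strongly scored chunk decide the result.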

☆143

Alternatives and similar repositories for bert_for_longer_texts

Users who are interested in bert_for_longer_texts are comparing it to the libraries listed below.

universal-ner / universal-ner
☆366 Updated last year
dborrelli / chat-intents
Clustering sentence embeddings to extract message intent
☆175 Updated 3 years ago
TimSchopf / KeyphraseVectorizers
Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum…
☆265 Updated 9 months ago
MIND-Lab / OCTIS
OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)
☆779 Updated last year
MaartenGr / BERTopic_evaluation
Code and experiments for *BERTopic: Neural topic modeling with a class-based TF-IDF procedure*
☆77 Updated last year
asahi417 / tner
Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer…
☆389 Updated 2 years ago
ccdv-ai / convert_checkpoint_to_lsg
Efficient Attention for Long Sequence Processing
☆97 Updated last year
cardiffnlp / tweetnlp
TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/und…
☆355 Updated 4 months ago
hitz-zentroa / GoLLIE
Guideline following Large Language Model for Information Extraction
☆391 Updated 9 months ago
IBM / zshot
Zero and Few shot named entity & relationships recognition
☆381 Updated 3 months ago
4AI / LS-LLaMA
A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning
☆155 Updated last year
jiehsheng / PatentBERT
☆67 Updated 4 years ago
chtmp223 / topicGPT
TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL'24)
☆329 Updated 4 months ago
saran9991 / llm-data-annotation
Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with L…
☆37 Updated last year
mehdiir / Roberta-Llama-Mistral
Comparing the Performance of LLMs: A Deep Dive into Roberta, Llama, and Mistral for Disaster Tweets Analysis with Lora
☆51 Updated last year
asahi417 / lm-question-generation
Multilingual/multidomain question generation datasets, models, and python library for question generation.
☆360 Updated 11 months ago
Shivanandroy / simpleT5
simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models.
☆397 Updated 2 years ago
aymeam / Datasets-for-Hate-Speech-Detection
Datasets for Hate Speech Detection
☆131 Updated 2 years ago
AI-Growth-Lab / PatentSBERTa
PatentSBERTa: A Deep NLP based Hybrid Model for Patent Distance and Classification using Augmented SBERT
☆88 Updated 9 months ago
poteminr / instruct-ner
Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER)
☆84 Updated last year
sh0416 / llama-classification
Text classification with Foundation Language Model LLaMA
☆114 Updated 2 years ago
MilaNLProc / contextualized-topic-models
A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher…
☆1,240 Updated 2 weeks ago
helmy-elrais / RoBERT_Recurrence_over_BERT
pyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with pyTorch …
☆82 Updated 2 years ago
webis-de / small-text
Active Learning for Text Classification in Python
☆621 Updated last week
cardiffnlp / tweeteval
Repository for TweetEval
☆381 Updated 3 years ago
UKPLab / gpl
Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: …
☆337 Updated 2 years ago
tomaarsen / SpanMarkerNER
SpanMarker for Named Entity Recognition
☆444 Updated 7 months ago
georgian-io / Multimodal-Toolkit
Multimodal model for text and tabular data with HuggingFace transformers as building block for text data
☆604 Updated 9 months ago
MaartenGr / cTFIDF
Creating class-based TF-IDF matrices
☆88 Updated 2 years ago
davidjurgens / potato
potato: portable text annotation tool
☆345 Updated 2 weeks ago