jxpress / setfit-pytorch-lightning
☆41 · Updated last year
Alternatives and similar repositories for setfit-pytorch-lightning:
Users who are interested in setfit-pytorch-lightning are comparing it to the libraries listed below.
- ☆46 · Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated 2 weeks ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 · Updated last year
- Ranking of fine-tuned HF models as base models. ☆35 · Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 · Updated last year
- ☆12 · Updated 6 months ago
- Data Programming by Demonstration (DPBD) for Document Classification ☆35 · Updated 3 years ago
- A library for squeakily cleaning and filtering language datasets. ☆45 · Updated last year
- ☆45 · Updated 2 years ago
- Stabilize and achieve excellent performance with transformers ☆41 · Updated 2 years ago
- ☆11 · Updated 3 years ago
- Code & Data for Comparative Opinion Summarization via Collaborative Decoding (Iso et al; Findings of ACL 2022) ☆21 · Updated last year
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ☆28 · Updated 2 years ago
- Embedding Recycling for Language models ☆38 · Updated last year
- A Python library aimed at dissecting and augmenting NER training data. ☆57 · Updated last year
- ☆30 · Updated 2 years ago
- ☆21 · Updated 3 years ago
- 🚀 A demonstration of hyperparameter optimization using Optuna for models implemented with AllenNLP. ☆16 · Updated 4 years ago
- ☆42 · Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆17 · Updated 3 months ago
- doccano auto labeling pipeline helps doccano to annotate a document automatically. ☆40 · Updated last year
- ☆29 · Updated 11 months ago
- Using short models to classify long texts ☆21 · Updated last year
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ☆17 · Updated 10 months ago
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ☆27 · Updated last month
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆73 · Updated 2 years ago
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆25 · Updated last year
- Pre-train Static Word Embeddings ☆42 · Updated this week
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 · Updated 3 years ago
- ☆13 · Updated last year