ml-jku / SDLG
SDLG is an efficient method for accurately estimating aleatoric semantic uncertainty in LLMs.
☆ 25 · Updated last year
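For context, methods in this family typically sample several generations from the model, cluster them by meaning, and measure how the probability mass spreads across the clusters. The sketch below illustrates that general recipe only; it is not the SDLG repository's API, and the model choices (`gpt2`, `microsoft/deberta-large-mnli`), the bidirectional-entailment clustering rule, and the crude sequence scoring are illustrative assumptions.

```python
# A minimal sketch of sampling-based semantic uncertainty estimation,
# the family of methods SDLG belongs to. NOT the SDLG repo's API: model
# choices, clustering rule, and scoring below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
nli = pipeline("text-classification", model="microsoft/deberta-large-mnli")

def semantically_equivalent(a: str, b: str) -> bool:
    """Bidirectional NLI entailment as a proxy for semantic equivalence."""
    fwd = nli({"text": a, "text_pair": b})[0]["label"]
    bwd = nli({"text": b, "text_pair": a})[0]["label"]
    return fwd == "ENTAILMENT" and bwd == "ENTAILMENT"

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,
    num_return_sequences=8,
    max_new_tokens=10,
    return_dict_in_generate=True,
    output_scores=True,
    pad_token_id=tokenizer.eos_token_id,
)
# Per-token log-probabilities of the sampled continuations; summing them
# gives a crude (unnormalized) sequence log-likelihood.
transition_scores = model.compute_transition_scores(
    out.sequences, out.scores, normalize_logits=True
)
seq_logp = transition_scores.sum(dim=1)
gen_only = out.sequences[:, inputs["input_ids"].shape[1]:]
texts = tokenizer.batch_decode(gen_only, skip_special_tokens=True)

# Greedy clustering: each sample joins the first cluster whose
# representative it is semantically equivalent to.
clusters: list[list[int]] = []
for i, text in enumerate(texts):
    for cluster in clusters:
        if semantically_equivalent(texts[cluster[0]], text):
            cluster.append(i)
            break
    else:
        clusters.append([i])

# Semantic entropy: entropy over the probability mass of each cluster.
probs = torch.softmax(seq_logp, dim=0)
cluster_mass = torch.stack([probs[c].sum() for c in clusters])
semantic_entropy = -(cluster_mass * cluster_mass.log()).sum()
print(f"{len(clusters)} semantic clusters; entropy = {semantic_entropy.item():.3f}")
```

Many answers landing in one cluster yields low entropy (the model is semantically confident even if surface forms differ); mass spread across many clusters signals high aleatoric semantic uncertainty.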
Alternatives and similar repositories for SDLG
Users interested in SDLG are comparing it to the repositories listed below.
- ☆ 131 · Updated 2 weeks ago
- Modalities, a PyTorch-native framework for distributed and reproducible foundation model training. ☆ 85 · Updated this week
- DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule ☆ 63 · Updated last year
- Official implementation of "BERTs are Generative In-Context Learners" ☆ 32 · Updated 4 months ago
- State-of-the-art paired encoder and decoder models (17M-1B params) ☆ 38 · Updated last week
- Official implementation of "GPT or BERT: why not both?" ☆ 57 · Updated last week
- LTG-Bert ☆ 33 · Updated last year
- ☆ 81 · Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆ 53 · Updated last year
- ☆ 69 · Updated 11 months ago
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆ 58 · Updated last year
- Efficient LLM inference on Slurm clusters using vLLM. ☆ 69 · Updated this week
- Official repository of Pretraining Without Attention (BiGS), the first model to achieve BERT-level transfer learning on the GLUE … ☆ 114 · Updated last year
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆ 41 · Updated 9 months ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆ 20 · Updated 2 years ago
- A collection of various LLM sampling methods implemented in pure PyTorch ☆ 23 · Updated 8 months ago
- nanoGPT-like codebase for LLM training ☆ 102 · Updated 2 months ago
- ☆ 54 · Updated 2 years ago
- A Toolkit for Distributional Control of Generative Models ☆ 73 · Updated last week
- PyTorch library for Active Fine-Tuning ☆ 88 · Updated 5 months ago
- A repository containing the code for translating popular LLM benchmarks to German. ☆ 27 · Updated last year
- Simple-to-use scoring function for arbitrarily tokenized texts. ☆ 45 · Updated 5 months ago
- ☆ 28 · Updated 5 months ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆ 107 · Updated 4 months ago
- Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs) ☆ 44 · Updated 2 weeks ago
- Interpreting the latent space representations of attention head outputs for LLMs ☆ 34 · Updated 11 months ago
- Embedding Recycling for Language models ☆ 39 · Updated 2 years ago
- Code release for the "Broken Neural Scaling Laws" (BNSL) paper ☆ 59 · Updated last year
- Simple and scalable tools for data-driven pretraining data selection. ☆ 25 · Updated 2 months ago
- Language models scale reliably with over-training and on downstream tasks ☆ 97 · Updated last year