ml-jku / SDLG
SDLG is an efficient method to accurately estimate aleatoric semantic uncertainty in LLMs
☆25 Updated 11 months ago
Alternatives and similar repositories for SDLG
Users who are interested in SDLG are comparing it to the repositories listed below
- ☆121 Updated 4 months ago
- ☆81 Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆115 Updated 5 months ago
- ☆19 Updated 3 weeks ago
- LTG-Bert ☆32 Updated last year
- Quantification of Uncertainty with Adversarial Models ☆28 Updated last year
- A collection of various LLM sampling methods implemented in pure Pytorch ☆24 Updated 5 months ago
- ☆68 Updated 9 months ago
- Official implementation of "BERTs are Generative In-Context Learners" ☆27 Updated 2 months ago
- ☆67 Updated 2 years ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆72 Updated 6 months ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 Updated last year
- Implementation of GateLoop Transformer in Pytorch and Jax ☆88 Updated 10 months ago
- Understand and test language model architectures on synthetic tasks. ☆195 Updated 2 months ago
- Repository for the code of the "PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided Decoding" paper, NAACL'22 ☆65 Updated 2 years ago
- Simple and scalable tools for data-driven pretraining data selection. ☆23 Updated 3 months ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆58 Updated last year
- Modalities, a PyTorch-native framework for distributed and reproducible foundation model training. ☆77 Updated last week
- Because we don't want a jupyter notebook mess... ☆62 Updated this week
- An annotated implementation of the Hyena Hierarchy paper ☆33 Updated last year
- Language models scale reliably with over-training and on downstream tasks ☆97 Updated last year
- ☆150 Updated 9 months ago
- ☆31 Updated 4 months ago
- Efficient LLM inference on Slurm clusters using vLLM. ☆62 Updated this week
- ☆54 Updated last year
- ICLR dataset ☆26 Updated last week
- Sequence Modeling with Structured State Spaces ☆64 Updated 2 years ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- JAX/Flax implementation of the Hyena Hierarchy ☆34 Updated 2 years ago