ltgoslo / bert-in-context
Official implementation of "BERTs are Generative In-Context Learners"
☆23 · Updated 7 months ago
Alternatives and similar repositories for bert-in-context:
Users interested in bert-in-context are comparing it to the repositories listed below.
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆53 · Updated 4 months ago
- Monet: Mixture of Monosemantic Experts for Transformers ☆43 · Updated this week
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆36 · Updated 3 months ago
- Code for the NeurIPS 2024 Spotlight "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆66 · Updated 2 months ago
- A general framework for inference-time scaling and steering of diffusion models with arbitrary rewards ☆26 · Updated this week
- PyTorch implementation of "Long Horizon Temperature Scaling" (ICML 2023) ☆20 · Updated last year
- Code for PHATGOOSE, introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆80 · Updated 10 months ago
- ReBase: Training Task Experts through Retrieval-Based Distillation ☆28 · Updated 6 months ago
- PyTorch implementation of MRL ☆18 · Updated 10 months ago
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆81 · Updated 2 months ago
- Discovering Data-driven Hypotheses in the Wild ☆51 · Updated last month
- PyTorch library for Active Fine-Tuning ☆52 · Updated last week
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆75 · Updated this week
- Aioli: A unified optimization framework for language model data mixing ☆18 · Updated 2 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆119 · Updated this week
- A repository for research on medium-sized language models ☆76 · Updated 7 months ago
- Open-source replication of Anthropic's Crosscoders for model diffing ☆28 · Updated 2 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs ☆90 · Updated last month
- gzip Predicts Data-dependent Scaling Laws ☆33 · Updated 7 months ago
- Minimum Description Length probing for neural network representations ☆18 · Updated last week
- A MAD laboratory to improve AI architecture designs 🧪 ☆102 · Updated last month
- BPE modification that removes intermediate tokens during tokenizer training ☆25 · Updated last month
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆46 · Updated last year
- Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, finetuning, evaluating, and serving LLMs in JAX/Fl… ☆64 · Updated 5 months ago