VectorInstitute / flex_model
☆12 · Updated 8 months ago
Related projects
Alternatives and complementary repositories for flex_model
- Influence Experiments ☆35 · Updated last year
- LLM finetuning in resource-constrained environments. ☆41 · Updated 4 months ago
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models ☆41 · Updated last year
- AI Logging for Interpretability and Explainability 🔬 ☆88 · Updated 5 months ago
- The original Backpack Language Model implementation, a fork of FlashAttention ☆64 · Updated last year
- ☆33 · Updated 2 years ago
- ☆44 · Updated last year
- ☆78 · Updated 2 years ago
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici…" ☆94 · Updated last year
- A Kernel-Based View of Language Model Fine-Tuning (https://arxiv.org/abs/2210.05643) ☆69 · Updated last year
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning ☆97 · Updated last year
- A user toolkit for analyzing and interfacing with Large Language Models (LLMs) ☆21 · Updated 2 months ago
- Token-level Reference-free Hallucination Detection ☆92 · Updated last year
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆46 · Updated 2 years ago
- NanoGPT-like codebase for LLM training ☆73 · Updated this week
- DEMix Layers for Modular Language Modeling ☆53 · Updated 3 years ago
- Benchmark API for Multidomain Language Modeling ☆24 · Updated 2 years ago
- ☆61 · Updated 2 years ago
- SILO Language Models code repository ☆80 · Updated 8 months ago
- The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences" ☆68 · Updated 10 months ago
- We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts… ☆92 · Updated 3 months ago
- Adding new tasks to T0 without catastrophic forgetting ☆30 · Updated 2 years ago
- ☆58 · Updated last year
- Official Repository for Dataset Inference for LLMs ☆23 · Updated 3 months ago
- ☆12 · Updated 5 months ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆79 · Updated last year
- Röttger et al. (2023): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆61 · Updated 10 months ago
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?" ☆56 · Updated last year
- Measuring the Mixing of Contextual Information in the Transformer ☆25 · Updated last year
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks ☆29 · Updated 8 months ago