JulesGM / peft_qlora
☆14 Updated last year
Related projects
Alternatives and complementary repositories for peft_qlora
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 Updated 6 months ago
- Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference… ☆197 Updated 5 months ago
- Train very large language models in Jax. ☆195 Updated last year
- Extract full next-token probabilities via language model APIs ☆229 Updated 9 months ago
- ☆128 Updated 10 months ago
- git extension for {collaborative, communal, continual} model development ☆205 Updated last week
- Utilities for the HuggingFace transformers library ☆61 Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆252 Updated last year
- ☆107 Updated this week
- NanoGPT-like codebase for LLM training ☆75 Updated this week
- ☆108 Updated last year
- Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023 ☆127 Updated 6 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆101 Updated last year
- Seminar on Large Language Models (COMP790-101 at UNC Chapel Hill, Fall 2022) ☆308 Updated 2 years ago
- A library to create and manage configuration files, especially for machine learning projects. ☆77 Updated 2 years ago
- Some common Huggingface transformers in maximal update parametrization (µP) ☆76 Updated 2 years ago
- ☆158 Updated last year
- ☆132 Updated last year
- Sparse and discrete interpretability tool for neural networks ☆55 Updated 9 months ago
- Emergent world representations: Exploring a sequence model trained on a synthetic task ☆170 Updated last year
- Experiments with generating opensource language model assistants ☆97 Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆178 Updated 3 months ago
- Tools for understanding how transformer predictions are built layer-by-layer ☆430 Updated 5 months ago
- Understand and test language model architectures on synthetic tasks. ☆162 Updated 6 months ago
- ☆91 Updated last year
- RuLES: a benchmark for evaluating rule-following in language models ☆211 Updated last month
- Chain-of-Hindsight, A Scalable RLHF Method ☆220 Updated last year
- Erasing concepts from neural representations with provable guarantees ☆209 Updated last week
- ☆240 Updated 4 months ago
- Inference code for LLaMA models in JAX ☆113 Updated 6 months ago