dmarx / bench-warmers
DigThatData's Public Brainstorming space
☆66Updated this week
Alternatives and similar repositories for bench-warmers:
Users who are interested in bench-warmers are comparing it to the repositories listed below
- ☆30Updated 3 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆95Updated last month
- Cerule - A Tiny Mighty Vision Model☆67Updated 4 months ago
- Utilities for PyTorch distributed☆23Updated last year
- ☆33Updated 4 months ago
- Focused on fast experimentation and simplicity☆65Updated last month
- ☆49Updated 10 months ago
- Merge LLMs that are split into parts☆25Updated last year
- Implementation of the Mamba SSM with hf_integration.☆56Updated 5 months ago
- Collection of autoregressive model implementations☆78Updated 3 weeks ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆33Updated 10 months ago
- QLoRA for Masked Language Modeling☆21Updated last year
- ☆27Updated last year
- Demonstration that finetuning a RoPE model on longer sequences than the pre-trained model used adapts the model's context limit☆63Updated last year
- Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens)☆44Updated 3 months ago
- Experimental sampler to make LLMs more creative☆30Updated last year
- ☆48Updated last year
- Train Llama LoRAs Easily☆30Updated last year
- ☆24Updated 7 months ago
- ☆26Updated 10 months ago
- Official implementation of ECCV24 paper: POA☆24Updated 5 months ago
- ☆78Updated 9 months ago
- CLIP Guided Diffusion☆60Updated 10 months ago
- ☆51Updated last year
- [WIP] Transformer to embed Danbooru labelsets☆13Updated 10 months ago
- A repository of projects and datasets under active development by Alignment Lab AI☆22Updated last year
- ☆32Updated last year
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam☆73Updated 6 months ago
- Tools for content datamining and NLP at scale☆42Updated 7 months ago
- Modeling code for a BitNet b1.58 Llama-style model.☆23Updated 9 months ago