Blackzxy / LoGAH
☆22 · Updated 4 months ago
Alternatives and similar repositories for LoGAH:
Users who are interested in LoGAH are comparing it to the libraries listed below.
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" ☆17 · Updated 3 weeks ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆36 · Updated 3 months ago
- ☆28 · Updated 3 months ago
- Official code implementation for the work Preference Alignment with Flow Matching (NeurIPS 2024) ☆20 · Updated 2 months ago
- ☆24 · Updated 3 weeks ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 · Updated 8 months ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 · Updated last year
- ☆41 · Updated last year
- Triton Implementation of HyperAttention Algorithm ☆46 · Updated last year
- ☆29 · Updated 3 months ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆25 · Updated 9 months ago
- Stick-breaking attention ☆41 · Updated 2 weeks ago
- ☆30 · Updated 2 months ago
- The repository contains code for Adaptive Data Optimization ☆20 · Updated last month
- ☆70 · Updated 5 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 · Updated last year
- A repository for research on medium sized language models. ☆76 · Updated 8 months ago
- Efficient Scaling laws and collaborative pretraining. ☆13 · Updated this week
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆23 · Updated 4 months ago
- Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group ☆36 · Updated 4 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 · Updated 7 months ago
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024) ☆27 · Updated 6 months ago
- ☆21 · Updated 2 months ago
- Source code for the paper "Positional Attention: Out-of-Distribution Generalization and Expressivity for Neural Algorithmic Reasoning" ☆14 · Updated 2 weeks ago
- ☆44 · Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆52 · Updated last year
- Implementation of Spectral State Space Models ☆16 · Updated 11 months ago
- JORA: JAX Tensor-Parallel LoRA Library (ACL 2024) ☆32 · Updated 9 months ago
- Minimum Description Length probing for neural network representations ☆18 · Updated this week