SamsungSAILMontreal / nino
Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [ICLR 2025]
☆25Updated last month
Alternatives and similar repositories for nino
Users who are interested in nino are comparing it to the libraries listed below.
- ☆29Updated 2 weeks ago
- ☆82Updated last year
- ☆88Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆60Updated last year
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation☆45Updated last month
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning.☆119Updated 3 weeks ago
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction.☆42Updated 7 months ago
- PyTorch implementation of the PEER block from the paper "Mixture of A Million Experts" by Xu Owen He at DeepMind☆131Updated 3 weeks ago
- Python package for generating datasets to evaluate reasoning and retrieval of large language models☆19Updated 2 months ago
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"☆61Updated last year
- Fork of Flame repo for training of some new stuff in development☆19Updated this week
- Official repo of paper LM2☆46Updated 9 months ago
- Lottery Ticket Adaptation☆40Updated last year
- ☆13Updated 8 months ago
- Code for the paper Don't Pay Attention☆50Updated last month
- ☆33Updated 10 months ago
- ☆130Updated last month
- ☆47Updated last year
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024)☆38Updated last year
- σ-GPT: A New Approach to Autoregressive Models☆69Updated last year
- UQ: Assessing Language Models on Unsolved Questions☆28Updated 2 months ago
- Resa: Transparent Reasoning Models via SAEs☆44Updated 2 months ago
- Official Repository for Task-Circuit Quantization☆24Updated 5 months ago
- A repository for research on medium sized language models.☆78Updated last year
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources☆147Updated last month
- ☆33Updated last year
- ☆55Updated last year
- GoldFinch and other hybrid transformer components☆45Updated last year
- https://x.com/BlinkDL_AI/status/1884768989743882276☆28Updated 6 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆103Updated 11 months ago