IdoAmos / not-from-scratch
☆25Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for not-from-scratch
- Official PyTorch Implementation of the Longhorn Deep State Space Model☆40Updated 3 months ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆24Updated 7 months ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…☆61Updated 6 months ago
- ☆24Updated 8 months ago
- ☆50Updated 6 months ago
- ☆69Updated 8 months ago
- ☆53Updated 3 weeks ago
- ☆28Updated 7 months ago
- ☆45Updated 9 months ago
- Online Adaptation of Language Models with a Memory of Amortized Contexts (NeurIPS 2024)☆53Updated 3 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs☆48Updated 7 months ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards☆39Updated 3 months ago
- The official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆32Updated last month
- JAX implementation of "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆12Updated 6 months ago
- ☆15Updated 4 months ago
- ☆44Updated last year
- ☆54Updated last month
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"☆63Updated 9 months ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients☆26Updated 2 months ago
- The official implementation of Self-Exploring Language Models (SELM)☆55Updated 5 months ago
- ☆24Updated 7 months ago
- ☆29Updated 2 months ago
- ☆35Updated 7 months ago
- Stick-breaking attention☆34Updated last week
- HGRN2: Gated Linear RNNs with State Expansion☆49Updated 3 months ago
- Minimal but scalable implementation of large language models in JAX☆26Updated 2 weeks ago
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment☆46Updated 5 months ago
- Self-Supervised Alignment with Mutual Information☆14Updated 5 months ago
- DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models☆57Updated 3 weeks ago
- ☆45Updated 4 months ago