jfpuget / ARC-AGI-Challenge-2024
☆36 Updated this week
Related projects
Alternatives and complementary repositories for ARC-AGI-Challenge-2024
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind ☆53 Updated 2 months ago
- Implementation of the proposed Spline-Based Transformer from Disney Research ☆76 Updated last week
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆85 Updated 2 months ago
- Implementation of Infini-Transformer in PyTorch ☆104 Updated last month
- ☆53 Updated 10 months ago
- Collection of autoregressive model implementations ☆67 Updated this week
- ☆128 Updated this week
- σ-GPT: A New Approach to Autoregressive Models ☆59 Updated 3 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆29 Updated last month
- PyTorch implementation of a simple way to enable (Stochastic) Frame Averaging for any network ☆47 Updated 3 months ago
- ☆48 Updated last month
- ☆76 Updated 7 months ago
- A port of the Mistral-7B model to JAX ☆30 Updated 4 months ago
- ☆73 Updated 4 months ago
- ☆49 Updated 8 months ago
- ☆27 Updated 6 months ago
- Implementation of the Llama architecture with RLHF + Q-learning ☆157 Updated 10 months ago
- microjax: a JAX-like function transformation engine, but micro ☆26 Updated 3 weeks ago
- LLM training in simple, raw C/CUDA ☆12 Updated last month
- Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead ☆109 Updated last week
- Triton implementation of the HyperAttention algorithm ☆46 Updated 11 months ago
- The official implementation of MARS: Unleashing the Power of Variance Reduction for Training Large Models ☆65 Updated this week
- Latent Diffusion Language Models ☆67 Updated last year
- Our solution for the ARC challenge 2024 ☆32 Updated last week
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data. ☆17 Updated 2 weeks ago
- ☆39 Updated 10 months ago
- ☆53 Updated 3 weeks ago
- Normalized Transformer (nGPT) ☆66 Updated this week
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆71 Updated last month