ahmadmustafaanis / C4AI-Scholars-Challenge
☆13 Updated 11 months ago
Related projects
Alternatives and complementary repositories for C4AI-Scholars-Challenge
- Building GPT ... ☆17 Updated 2 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆84 Updated 2 months ago
- ML/DL Math and Method notes ☆57 Updated 11 months ago
- Textbook on reinforcement learning from human feedback ☆74 Updated 2 weeks ago
- Collection of autoregressive model implementations ☆66 Updated last week
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆46 Updated 9 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆47 Updated 5 months ago
- Visualizations of the theory behind diffusion models. ☆74 Updated 6 months ago
- ☆46 Updated last month
- Experiments on GPT-3's ability to fit numerical models in-context. ☆14 Updated 2 years ago
- Implementation of some personal helper functions for Einops, my favorite tensor manipulation library ❤️ ☆52 Updated last year
- Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning" ☆34 Updated 7 months ago
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆55 Updated 5 months ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆89 Updated 2 months ago
- Resources from the EleutherAI Math Reading Group ☆50 Updated last month
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆60 Updated 2 years ago
- ☆139 Updated 2 months ago
- ☆72 Updated 4 months ago
- Cyclemoid implementation for PyTorch ☆87 Updated 2 years ago
- Implementation of the proposed Spline-Based Transformer from Disney Research ☆75 Updated this week
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind ☆52 Updated last month
- ☆73 Updated last year
- ☆122 Updated this week
- Minimal (400 LOC) implementation, maximum (multi-node, FSDP) GPT training ☆112 Updated 6 months ago
- RAGs: Simple implementations of Retrieval Augmented Generation (RAG) Systems ☆83 Updated 7 months ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆49 Updated last year
- Prune transformer layers ☆64 Updated 5 months ago
- Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k. ☆22 Updated last year