Alx-AI / AI_Diplomacy
☆208 Updated this week
Alternatives and similar repositories for AI_Diplomacy:
Users who are interested in AI_Diplomacy are comparing it to the repositories listed below.
- Fast bare-bones BPE for modern tokenizer training ☆151 Updated 5 months ago
- Following master Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish ☆170 Updated 7 months ago
- The Tensor (or Array) ☆427 Updated 7 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆167 Updated last week
- model activation visualiser ☆90 Updated this week
- small auto-grad engine inspired from Karpathy's micrograd and PyTorch ☆250 Updated 4 months ago
- The Multilayer Perceptron Language Model ☆544 Updated 7 months ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆782 Updated 3 weeks ago
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code. ☆313 Updated 2 weeks ago
- ☆106 Updated 3 months ago
- This repo has all the basic things you'll need in-order to understand complete vision transformer architecture and its various implementa… ☆213 Updated 2 months ago
- Build your own visual reasoning model ☆320 Updated this week
- (WIP) A small but powerful, homemade PyTorch from scratch. ☆540 Updated this week
- Compiling useful links, papers, benchmarks, ideas, etc. ☆41 Updated last week
- procedural reasoning datasets ☆541 Updated this week
- The Autograd Engine ☆587 Updated 6 months ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆230 Updated 4 months ago
- A repository consisting of paper/architecture replications of classic/SOTA AI/ML papers in pytorch ☆149 Updated this week
- Simple Transformer in Jax ☆136 Updated 9 months ago
- A puzzle to learn about prompting ☆124 Updated last year
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆168 Updated 2 months ago
- UNet diffusion model in pure CUDA ☆600 Updated 9 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 Updated 3 weeks ago
- ☆32 Updated last month
- Learnings and programs related to CUDA ☆370 Updated last month
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆310 Updated 3 months ago
- The history files when recording human interaction while solving ARC tasks ☆97 Updated this week
- An introduction to LLM Sampling ☆77 Updated 3 months ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆269 Updated 9 months ago
- GPU Kernels ☆155 Updated this week