Atenrev / chessformers
This is a PyTorch implementation of a Transformer decoder-based model that plays chess.
☆17Updated last year
Alternatives and similar repositories for chessformers
Users who are interested in chessformers are comparing it to the repositories listed below.
- ☆108Updated 5 months ago
- Our solution for the arc challenge 2024☆186Updated 6 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)☆198Updated last year
- Alice in Wonderland code base for experiments and raw experiments data☆131Updated 3 months ago
- ☆213Updated 4 months ago
- code for training & evaluating Contextual Document Embedding models☆202Updated 7 months ago
- Repository for the paper Stream of Search: Learning to Search in Language☆152Updated 11 months ago
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper☆132Updated 3 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs.☆181Updated 6 months ago
- Large multi-modal models (L3M) pre-training.☆224Updated 3 months ago
- MatFormer repo☆67Updated last year
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind☆72Updated last year
- Vision Language Models are Biased☆105Updated 2 weeks ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources☆149Updated 3 months ago
- Sparse and discrete interpretability tool for neural networks☆65Updated last year
- Collection of autoregressive model implementations☆85Updated this week
- Experiments for efforts to train a new and improved t5☆76Updated last year
- PyTorch library for Active Fine-Tuning☆96Updated 3 months ago
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning"☆121Updated 6 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.☆174Updated 11 months ago
- Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and …☆218Updated last year
- ☆150Updated 4 months ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'☆234Updated 5 months ago
- LLM-Merging: Building LLMs Efficiently through Merging☆208Updated last year
- Implementation of the Llama architecture with RLHF + Q-learning☆170Updated 11 months ago
- Fine-tune Gemma 3 on an object detection task☆95Updated 5 months ago
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments.☆54Updated last year
- Understanding how features learned by neural networks evolve throughout training☆41Updated last year
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆125Updated 3 months ago
- Create an AI capable of solving reasoning tasks it has never seen before☆96Updated last year