evintunador / templateGPT
customizable template GPT code designed for easy novel architecture experimentation
☆26 Updated 3 months ago
Alternatives and similar repositories for templateGPT
Users who are interested in templateGPT are comparing it to the repositories listed below
- ☆133 Updated 10 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆101 Updated 3 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆140 Updated 4 months ago
- Collection of autoregressive model implementations ☆85 Updated 2 months ago
- ☆114 Updated 6 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆173 Updated 5 months ago
- code for training & evaluating Contextual Document Embedding models ☆195 Updated last month
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆198 Updated 11 months ago
- ☆98 Updated 5 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆70 Updated 6 months ago
- prime-rl is a codebase for decentralized async RL training at scale ☆347 Updated this week
- rl from zero pretrain, can it be done? we'll see. ☆56 Updated this week
- Train your own SOTA deductive reasoning model ☆94 Updated 3 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆311 Updated 8 months ago
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode… ☆45 Updated 2 months ago
- Code for ExploreTom ☆84 Updated 6 months ago
- ☆115 Updated 4 months ago
- Utils for Unsloth ☆99 Updated this week
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆71 Updated this week
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆154 Updated 8 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆66 Updated 2 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆190 Updated last year
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning ☆32 Updated last month
- So, I trained a 130M Llama architecture I coded from the ground up to build a small instruct model from scratch. Trained on FineWeb dataset… ☆15 Updated 3 months ago
- ☆127 Updated 3 months ago
- Prune transformer layers ☆69 Updated last year
- smolLM with Entropix sampler on PyTorch ☆150 Updated 7 months ago
- PyTorch implementation of models from the Zamba2 series. ☆182 Updated 5 months ago
- Long context evaluation for large language models ☆217 Updated 3 months ago
- Normalized Transformer (nGPT) ☆184 Updated 7 months ago