evintunador / templateGPT
Customizable template GPT code designed for easy experimentation with novel architectures
☆26 · Updated 5 months ago
Alternatives and similar repositories for templateGPT
Users interested in templateGPT are comparing it to the libraries listed below.
- A compact LLM pretrained in 9 days using high-quality data ☆323 · Updated 5 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 · Updated 8 months ago
- ☆134 · Updated last year
- RL from zero pretrain, can it be done? Yes. ☆265 · Updated 3 weeks ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally applicable memory systems for transformers ☆322 · Updated 10 months ago
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆678 · Updated last week
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆450 · Updated 11 months ago
- Plotting (entropy, varentropy) for small LMs ☆98 · Updated 3 months ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆105 · Updated 6 months ago
- ☆120 · Updated 8 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆146 · Updated 6 months ago
- One-click templates for inferencing Language Models ☆213 · Updated last month
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients ☆200 · Updated last year
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆240 · Updated 10 months ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆232 · Updated 10 months ago
- ☆118 · Updated last year
- smolLM with Entropix sampler in PyTorch ☆150 · Updated 10 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆334 · Updated 4 months ago
- Our own implementation of 'Layer Selective Rank Reduction' ☆240 · Updated last year
- ☆135 · Updated 3 weeks ago
- A pipeline for LLM knowledge distillation ☆107 · Updated 5 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆343 · Updated 9 months ago
- Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget ☆161 · Updated last month
- Draw more samples ☆193 · Updated last year
- Code for training & evaluating Contextual Document Embedding models ☆197 · Updated 4 months ago
- Long-context evaluation for large language models ☆221 · Updated 6 months ago
- An Open Source Toolkit For LLM Distillation ☆724 · Updated 2 months ago
- Video + code lecture on building nanoGPT from scratch ☆69 · Updated last year
- GRadient-INformed MoE ☆264 · Updated 11 months ago
- ☆91 · Updated 3 months ago
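Several entries above (the entropy/varentropy plotting repo and the Entropix-style sampler) revolve around two statistics of a model's next-token distribution: entropy and varentropy (the variance of the surprisal). A minimal sketch of how these could be computed from raw logits; this is an illustrative stand-alone function, not code from any of the listed repositories:

```python
import math

def entropy_varentropy(logits):
    """Return (entropy, varentropy) of softmax(logits), in nats."""
    # Numerically stable softmax over the logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Entropy: H = -sum_i p_i * log(p_i)
    h = -sum(p * math.log(p) for p in probs if p > 0)
    # Varentropy: variance of surprisal, sum_i p_i * (log(p_i) + H)^2
    v = sum(p * (math.log(p) + h) ** 2 for p in probs if p > 0)
    return h, v

# A uniform distribution has maximal entropy and zero varentropy;
# a peaked distribution has lower entropy.
h_u, v_u = entropy_varentropy([0.0, 0.0, 0.0, 0.0])
h_p, v_p = entropy_varentropy([5.0, 0.0, 0.0])
```

Samplers in the Entropix vein reportedly use these two numbers as a signal: low entropy and low varentropy indicate a confident prediction, while high varentropy means surprisal differs widely across candidate tokens, which can trigger branching or resampling.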