eric-prog / GPU-Grants
GPUGrants - a list of GPU grants that I can think of
☆17 · Updated last month
Alternatives and similar repositories for GPU-Grants:
Users who are interested in GPU-Grants are comparing it to the repositories listed below:
- Official implementation of "BERTs are Generative In-Context Learners" ☆27 · Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆55 · Updated 7 months ago
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆65 · Updated 3 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs ☆38 · Updated 5 months ago
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆48 · Updated last month
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆25 · Updated 5 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆30 · Updated last month
- Code for Zero-Shot Tokenizer Transfer ☆127 · Updated 3 months ago
- MEXMA: Token-level objectives improve sentence representations ☆40 · Updated 3 months ago
- ☆16 · Updated 6 months ago
- A basic pure pytorch implementation of flash attention ☆16 · Updated 5 months ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆118 · Updated 6 months ago
- ☆25 · Updated last year
- Universal Neurons in GPT2 Language Models ☆27 · Updated 10 months ago
- ☆78 · Updated 8 months ago
- ☆31 · Updated 3 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆105 · Updated 5 months ago
- Implementation of Bitune: Bidirectional Instruction-Tuning ☆19 · Updated 10 months ago
- We study toy models of skill learning. ☆25 · Updated 3 months ago
- Efficient encoder-decoder architecture for small language models (≤1B parameters) with cross-architecture knowledge distillation and visi… ☆23 · Updated 2 months ago
- Easily run PyTorch on multiple GPUs & machines ☆45 · Updated last month
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning" ☆76 · Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 · Updated last year
- LLM attention pattern visualizer ☆10 · Updated last year
- Prune transformer layers ☆69 · Updated 10 months ago
- Collection of autoregressive model implementation ☆85 · Updated this week
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and inexpensive to use. Specific… ☆67 · Updated 7 months ago
- This is the repo for the paper "PANGEA: A FULLY OPEN MULTILINGUAL MULTIMODAL LLM FOR 39 LANGUAGES" ☆105 · Updated 4 months ago
- Official Repository of Are Your LLMs Capable of Stable Reasoning? ☆25 · Updated last month
- These papers will provide unique insightful concepts that will broaden your perspective on neural networks and deep learning ☆48 · Updated last year