Scientific-Computing-Lab / MPI-rigen
MPI Code Generation through Domain-Specific Language Models
☆13Updated 8 months ago
Related projects
Alternatives and complementary repositories for MPI-rigen
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" in PyTorch and Zeta☆13Updated this week
- Lottery Ticket Adaptation☆35Updated last month
- Implementation of Spectral State Space Models☆17Updated 8 months ago
- A repository for research on medium-sized language models.☆74Updated 5 months ago
- ☆24Updated last month
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆33Updated 8 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer☆36Updated 7 months ago
- Efficient Dictionary Learning with Switch Sparse Autoencoders (SAEs)☆13Updated 3 weeks ago
- Training hybrid models for dummies.☆15Updated last week
- A testbed for agents and environments that can automatically improve models through data generation.☆12Updated last month
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated☆30Updated 2 months ago
- DPO, but faster 🚀☆20Updated last week
- ☆11Updated 3 weeks ago
- Visual RAG using less than 300 lines of code.☆23Updated 8 months ago
- An EXA-Scale repository of Multi-Modality AI resources, from papers and models to foundational libraries!☆41Updated 9 months ago
- Data preparation code for CrystalCoder 7B LLM☆42Updated 6 months ago
- ☆38Updated this week
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆25Updated last year
- ☆43Updated 3 months ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044 ☆30Updated last month
- Modified Beam Search with periodical restart☆12Updated last month
- Fast approximate inference on a single GPU with sparsity-aware offloading☆38Updated 10 months ago
- Linear Attention Sequence Parallelism (LASP)☆64Updated 5 months ago
- Demonstration that fine-tuning a RoPE model on longer sequences than it was pre-trained on extends the model's context limit☆63Updated last year
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆37Updated 5 months ago
- A Data Source for Reasoning Embodied Agents☆19Updated last year
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…☆11Updated 9 months ago
- Latent Large Language Models☆16Updated 2 months ago
- [ICML 24 NGSM workshop] Associative Recurrent Memory Transformer implementation and scripts for training and evaluating☆29Updated this week
- Using multiple LLMs for ensemble Forecasting☆16Updated 9 months ago