kyegomez / EvoVLM-JP
Plug-and-play PyTorch implementation of the paper "Evolutionary Optimization of Model Merging Recipes" by Sakana AI
☆26 · Updated this week
Related projects
Alternatives and complementary repositories for EvoVLM-JP
- Unofficial Implementation of Evolutionary Model Merging ☆33 · Updated 7 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆56 · Updated 5 months ago
- ☆62 · Updated 3 months ago
- ☆62 · Updated last month
- ☆50 · Updated last month
- A repository for research on medium sized language models. ☆74 · Updated 5 months ago
- ☆89 · Updated 4 months ago
- ☆61 · Updated 2 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆129 · Updated last month
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆91 · Updated last month
- ☆101 · Updated last month
- Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs ☆41 · Updated 4 months ago
- ☆49 · Updated 6 months ago
- Triton Implementation of HyperAttention Algorithm ☆46 · Updated 10 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆76 · Updated last month
- ☆182 · Updated 3 weeks ago
- Language models scale reliably with over-training and on downstream tasks ☆94 · Updated 7 months ago
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study ☆31 · Updated last week
- Token Omission Via Attention ☆119 · Updated 3 weeks ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆78 · Updated 8 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆44 · Updated 9 months ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆200 · Updated 5 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆104 · Updated last month
- This is the official repository for Inheritune. ☆105 · Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆46 · Updated 2 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆119 · Updated 2 weeks ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆60 · Updated last month
- Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024 ☆95 · Updated this week
- Collection of autoregressive model implementation ☆66 · Updated this week
- The official repo for "LLoCo: Learning Long Contexts Offline" ☆110 · Updated 4 months ago