jiasenlu / LL3M
LL3M: Large Language and Multi-Modal Model in Jax
☆69 Updated 9 months ago
Alternatives and similar repositories for LL3M:
Users who are interested in LL3M are comparing it to the repositories listed below.
- M4 experiment logbook ☆56 Updated last year
- Language models scale reliably with over-training and on downstream tasks ☆96 Updated 10 months ago
- ☆82 Updated 4 months ago
- Multimodal language model benchmark, featuring challenging examples ☆158 Updated 2 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆130 Updated 5 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆77 Updated 4 months ago
- ☆71 Updated 6 months ago
- This repository is maintained to release datasets and models for multimodal puzzle reasoning. ☆63 Updated 2 weeks ago
- Replicating O1 inference-time scaling laws ☆82 Updated 2 months ago
- ☆75 Updated 7 months ago
- PyTorch building blocks for the OLMo ecosystem ☆54 Updated this week
- A framework for few-shot evaluation of autoregressive language models. ☆24 Updated last year
- ☆95 Updated 7 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆46 Updated last year
- Self-Alignment with Principle-Following Reward Models ☆154 Updated 11 months ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in PyTorch ☆156 Updated last month
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆24 Updated 5 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆120 Updated 3 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆70 Updated 3 months ago
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆65 Updated 6 months ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆25 Updated 10 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc. ☆194 Updated last week
- Some common Hugging Face transformers in maximal update parametrization (µP) ☆78 Updated 2 years ago
- A basic pure PyTorch implementation of flash attention ☆16 Updated 3 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆81 Updated 11 months ago
- Triton Implementation of HyperAttention Algorithm ☆46 Updated last year
- A repository for research on medium-sized language models. ☆76 Updated 8 months ago
- Implementation of Infini-Transformer in PyTorch ☆109 Updated last month
- Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding" ☆46 Updated 2 months ago