apple / ml-planner
☆40 Updated 7 months ago
Related projects
Alternatives and complementary repositories for ml-planner
- Language models scale reliably with over-training and on downstream tasks ☆94 Updated 7 months ago
- ☆62 Updated last month
- Experiments for efforts to train a new and improved t5 ☆76 Updated 6 months ago
- ☆50 Updated last month
- Repository for the paper Stream of Search: Learning to Search in Language ☆84 Updated 3 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆83 Updated last week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆46 Updated 2 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆112 Updated 6 months ago
- ☆53 Updated 9 months ago
- ☆61 Updated 2 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆60 Updated last month
- ☆50 Updated last week
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆38 Updated 2 weeks ago
- ☆99 Updated 3 months ago
- ☆72 Updated 4 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆62 Updated 2 months ago
- M4 experiment logbook ☆56 Updated last year
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆46 Updated last week
- Functional Benchmarks and the Reasoning Gap ☆78 Updated last month
- Public Inflection Benchmarks ☆69 Updated 8 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆29 Updated 3 weeks ago
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆74 Updated 2 weeks ago
- σ-GPT: A New Approach to Autoregressive Models ☆59 Updated 2 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 Updated 6 months ago
- Collection of autoregressive model implementation ☆66 Updated this week
- A repository for research on medium sized language models. ☆74 Updated 5 months ago
- ☆76 Updated 6 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 Updated 11 months ago
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆36 Updated last year
- ☆74 Updated last week