matttreed / diloco-sim
☆16 Updated 3 months ago
Alternatives and similar repositories for diloco-sim:
Users who are interested in diloco-sim are comparing it to the libraries listed below.
- ☆65 Updated this week
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode… ☆39 Updated this week
- A collection of optimizers for MLX ☆34 Updated last month
- look how they massacred my boy ☆63 Updated 6 months ago
- Entropy Based Sampling and Parallel CoT Decoding ☆17 Updated 6 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆95 Updated last month
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 Updated 2 months ago
- ☆43 Updated last year
- ☆37 Updated 2 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆60 Updated last month
- ☆27 Updated 9 months ago
- ☆11 Updated last month
- ☆38 Updated 8 months ago
- ☆48 Updated last year
- ☆48 Updated last year
- Simple GRPO scripts and configurations. ☆58 Updated 2 months ago
- ☆13 Updated 5 months ago
- ☆120 Updated 3 weeks ago
- Collection of LLM completions for reasoning-gym task datasets ☆18 Updated this week
- NanoGPT (124M) quality in 2.67B tokens ☆28 Updated this week
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation. ☆43 Updated 7 months ago
- direct preference optimization with only 1 model copy :) ☆14 Updated last year
- A 7B parameter model for mathematical reasoning ☆27 Updated 2 months ago
- ☆22 Updated 6 months ago
- Using modal.com to process FineWeb-edu data ☆20 Updated 2 weeks ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆86 Updated 3 weeks ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆22 Updated 2 weeks ago
- ☆20 Updated 5 months ago
- ☆24 Updated 3 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆26 Updated 7 months ago