Benchmarking Optimizers for LLM Pretraining
☆52Dec 30, 2025Updated 2 months ago
Alternatives and similar repositories for llm-optimizer-benchmark
Users who are interested in llm-optimizer-benchmark are comparing it to the libraries listed below
- 100M tokens, no time limit, best val loss wins!☆103Updated this week
- The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size☆19May 19, 2019Updated 6 years ago
- ☆25Feb 20, 2026Updated last week
- ☆45Jul 21, 2025Updated 7 months ago
- A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively.☆62Updated this week
- Repository for ACL 2020 paper: "Multi-Sentence Argument Linking"☆27Feb 11, 2023Updated 3 years ago
- ☆31Apr 19, 2025Updated 10 months ago
- ☆56Sep 17, 2025Updated 5 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆32Sep 19, 2025Updated 5 months ago
- ☆35Jul 5, 2023Updated 2 years ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆91Oct 30, 2024Updated last year
- CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics☆27Nov 1, 2025Updated 3 months ago
- ☆27Dec 3, 2025Updated 2 months ago
- Well-annotated word2vec source code with detailed bilingual comments☆10Oct 3, 2021Updated 4 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated last year
- ☆11May 24, 2024Updated last year
- PSI-MOD ontology for modified and unmodified amino acid residues☆14Jan 8, 2026Updated last month
- ☆17Nov 18, 2025Updated 3 months ago
- ☆12Jun 19, 2024Updated last year
- ThetaEvolve: Test-time Learning on Open Problems, enabling RL training on AlphaEvolve/OpenEvolve and emphasizing scaling test-time comput…☆126Dec 29, 2025Updated 2 months ago
- ☆10Feb 4, 2025Updated last year
- Tutorials for MATH 4432 Statistical Machine Learning, HKUST, Fall 2022☆11Sep 17, 2024Updated last year
- ☆15Nov 18, 2025Updated 3 months ago
- ETL project to download and process both CME open interest data, COT data from the CFTC and NAV/shares-outstanding data from various ETF …☆12Jul 13, 2021Updated 4 years ago
- ☆12Apr 8, 2021Updated 4 years ago
- A library for creating story-driven, clustered load-testing packages through a readable, understandable API.☆30May 20, 2010Updated 15 years ago
- ☆12Jun 20, 2023Updated 2 years ago
- ☆18Jun 6, 2025Updated 8 months ago
- ☆13Jan 7, 2025Updated last year
- Korean Abstract Meaning Representation (AMR) Corpus☆10Feb 27, 2022Updated 4 years ago
- ☆23Jul 11, 2025Updated 7 months ago
- Generating Protein Variants with Different Generative Models (HMM, VAE, ESM-2, ProtGPT2)☆11Mar 14, 2024Updated last year
- Various CTF challenge solutions☆12Apr 20, 2021Updated 4 years ago
- Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery☆17Jul 11, 2023Updated 2 years ago
- 2D Gaussian Splatting implemented in Triton☆11Mar 19, 2024Updated last year
- Research for the Drone Aerial Tasking Manager☆10Jul 12, 2024Updated last year
- ☆11May 20, 2021Updated 4 years ago
- Post-processing peptide de novo sequences to improve their accuracy☆10Nov 29, 2022Updated 3 years ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024)☆12Oct 31, 2024Updated last year