The-Swarm-Corporation / StatisticalModelEvaluator
An implementation of Anthropic's paper and essay on "A statistical approach to model evaluations"
☆15 Updated 2 months ago
Alternatives and similar repositories for StatisticalModelEvaluator
Users who are interested in StatisticalModelEvaluator compare it to the libraries listed below.
- Simple GRPO scripts and configurations. ☆58 Updated 4 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆53 Updated 4 months ago
- ☆38 Updated 11 months ago
- NanoGPT (124M) quality in 2.67B tokens ☆28 Updated last month
- Latent Large Language Models ☆18 Updated 10 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆30 Updated 2 months ago
- BH hackathon ☆14 Updated last year
- LLM training in simple, raw C/CUDA ☆14 Updated 6 months ago
- ☆47 Updated 4 months ago
- Minimum Description Length probing for neural network representations ☆18 Updated 4 months ago
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following" ☆23 Updated 3 weeks ago
- ☆10 Updated 2 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆23 Updated 2 months ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆17 Updated 3 months ago
- A sample pattern for running CI tests on Modal ☆18 Updated 2 months ago
- Lego for GRPO ☆28 Updated 3 weeks ago
- ☆16 Updated last year
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆22 Updated 8 months ago
- Implementation of https://arxiv.org/pdf/2312.09299 ☆21 Updated 11 months ago
- ☆26 Updated last year
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode… ☆45 Updated 2 months ago
- Implementation of Spectral State Space Models ☆16 Updated last year
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆24 Updated last month
- ☆23 Updated last year
- ☆23 Updated 6 months ago
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions. ☆20 Updated last year
- Verifiers for LLM Reinforcement Learning ☆60 Updated 2 months ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 Updated last year
- GoldFinch and other hybrid transformer components ☆45 Updated 11 months ago