rawsh / mirrorllm

various experiments for scaling inference time compute with small reasoning models

☆17

Alternatives and similar repositories for mirrorllm

Users that are interested in mirrorllm are comparing it to the libraries listed below

minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 Updated 9 months ago
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆145 Updated 9 months ago
emrgnt-cmplxty / zero-shot-replication
☆73 Updated 2 years ago
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆58 Updated last month
teknium1 / transformers-gptq-quant
☆45 Updated 2 years ago
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆119 Updated last year
EduardTalianu / EntropixLab
entropix style sampling + GUI
☆27 Updated last year
arcee-ai / DAM
☆55 Updated last year
sdan / selfextend
an implementation of Self-Extend, to expand the context window via grouped attention
☆119 Updated last year
interstellarninja / function-calling-eval
A framework for evaluating function calls made by LLMs
☆40 Updated last year
Danau5tin / calculator_agent_rl
Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a 62% absolute increase in evaluation accuracy.
☆59 Updated 6 months ago
emrgnt-cmplxty / SmolTrainer
☆21 Updated 2 years ago
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆51 Updated 9 months ago
JoshuaPurtell / SmallBench
Small, simple agent task environments for training and evaluation
☆19 Updated last year
AblateIt / finetune-study
Comprehensive analysis of the differences in performance between QLoRA, LoRA, and full finetunes.
☆82 Updated 2 years ago
zbambergerNLP / strategic-debate-tot
A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments
☆92 Updated last month
deployradiant / pychatml
Chat Markup Language conversation library
☆55 Updated last year
automix-llm / automix
Mixing Language Models with Self-Verification and Meta-Verification
☆110 Updated 11 months ago
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆107 Updated 8 months ago
ChrisHayduk / QLoRA-for-MLM
QLoRA for Masked Language Modeling
☆22 Updated 2 years ago
Arize-ai / LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
☆106 Updated 2 months ago
migtissera / Sensei
Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI
☆221 Updated last year
euclaise / SlimTrainer
Full finetuning of large language models without large memory requirements
☆94 Updated 2 months ago
huu4ontocord / MDEL
Multi-Domain Expert Learning
☆66 Updated last year
Hannibal046 / nanoColBERT
Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).
☆79 Updated last year
allenai / CommonGen-Eval
Evaluating LLMs with CommonGen-Lite
☆91 Updated last year
Alex-Gurung / ReasoningNCP
Official repo for Learning to Reason for Long-Form Story Generation
☆72 Updated 7 months ago
ChrisHayduk / qlora-multi-gpu
QLoRA with Enhanced Multi GPU Support
☆37 Updated 2 years ago
xingyaoww / LeTI
Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions."
☆66 Updated 2 years ago
VikParuchuri / classified
Score LLM pretraining data with classifiers
☆54 Updated 2 years ago