cognitivecomputations / q-star

☆9

Alternatives and similar repositories for q-star

Users who are interested in q-star are comparing it to the libraries listed below.

EduardTalianu / EntropixLab
Entropix-style sampling + GUI
☆26 Updated 7 months ago
SebastianBodza / EnsembleForecasting
Using multiple LLMs for ensemble forecasting
☆16 Updated last year
cognitivecomputations / generate
☆28 Updated last year
brendanhogan / completion_tree_view
☆9 Updated last month
joey00072 / Attention-as-graph
An alternative way of calculating self-attention
☆18 Updated last year
cg123 / bitnet
Modeling code for a BitNet b1.58 Llama-style model.
☆25 Updated last year
matthewrenze / jhu-concise-cot
The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models
☆22 Updated 6 months ago
zjunlp / DynamicKnowledgeCircuits
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
☆35 Updated last month
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆92 Updated 2 months ago
glaive-ai / reflection_70b_training
☆17 Updated 3 months ago
arcee-ai / DAM
☆49 Updated 6 months ago
diicellman / dynamite-dogs
BH hackathon
☆14 Updated last year
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆90 Updated 4 months ago
severian42 / Computational-Model-for-Symbolic-Representations
Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …
☆48 Updated 3 months ago
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆53 Updated 4 months ago
lechmazur / pgg_bench
Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agent benchmark that tests cooperative and self-interested strategies a…
☆36 Updated last month
teknium1 / ShareGPT-Builder
☆114 Updated 5 months ago
SinatrasC / entropix
Entropy Based Sampling and Parallel CoT Decoding
☆17 Updated 7 months ago
argilla-io / argilla-cookbook
Simple examples using Argilla tools to build AI
☆53 Updated 6 months ago
HarleyCoops / smolThinker-.5B
A Qwen .5B reasoning model trained on OpenR1-Math-220k
☆14 Updated 3 months ago
OpenMOSS / Lorsa
☆19 Updated 3 weeks ago
ElleLeonne / Lightning-ReLoRA
A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.
☆33 Updated last year
bdambrosio / AllTheWorldAPlay
All the world is a play, we are but actors in it.
☆50 Updated this week
JD-P / RetroInstruct
Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search.
☆32 Updated 3 months ago
cognitivecomputations / extract-expert
Extract a single expert from a Mixture of Experts model using SLERP interpolation.
☆17 Updated last year
AtakanTekparmak / tiny_fnc_engine
tiny_fnc_engine is a minimal Python library that provides a flexible engine for calling functions extracted from an LLM.
☆38 Updated 8 months ago
portal-cornell / muCode
☆17 Updated 3 months ago
axolotl-ai-cloud / grpo_code
A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.
☆24 Updated 2 months ago
SohamGovande / podplex
🦾💻🌐 distributed training & serverless inference at scale on RunPod
☆17 Updated last year
rohinmanvi / Capability-Aware_and_Mid-Generation_Self-Evaluations
☆21 Updated 5 months ago