robertvacareanu / llm4regression
Examining how large language models (LLMs) perform across various synthetic regression tasks when given (input, output) examples in their context, without any parameter updates
☆153 · Updated 11 months ago
Alternatives and similar repositories for llm4regression
Users who are interested in llm4regression are comparing it to the repositories listed below.
- Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch · ☆129 · Updated 2 weeks ago
- ☆88 · Updated last year
- ☆69 · Updated last year
- Simple GRPO scripts and configurations. · ☆59 · Updated 6 months ago
- Mixing Language Models with Self-Verification and Meta-Verification · ☆105 · Updated 8 months ago
- Train your own SOTA deductive reasoning model · ☆104 · Updated 5 months ago
- ☆228 · Updated 2 weeks ago
- Codebase accompanying the Summary of a Haystack paper. · ☆79 · Updated 11 months ago
- Code for NeurIPS LLM Efficiency Challenge · ☆59 · Updated last year
- ☆54 · Updated 9 months ago
- Banishing LLM Hallucinations Requires Rethinking Generalization · ☆276 · Updated last year
- Automating enterprise workflows with multimodal agents · ☆110 · Updated 10 months ago
- Evaluating LLMs with CommonGen-Lite · ☆91 · Updated last year
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) · ☆118 · Updated 6 months ago
- Official implementation of MetaTree: Learning a Decision Tree Algorithm with Transformers · ☆113 · Updated 11 months ago
- ☆118 · Updated last year
- Code for ExploreTom · ☆86 · Updated 2 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" · ☆73 · Updated 8 months ago
- Reference-free automatic summarization evaluation with potential hallucination detection · ☆103 · Updated last year
- Set of scripts to finetune LLMs · ☆37 · Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna · ☆55 · Updated 6 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… · ☆49 · Updated last year
- Doing simple retrieval from LLM models at various context lengths to measure accuracy · ☆102 · Updated last year
- ☆34 · Updated 3 weeks ago
- Functional Benchmarks and the Reasoning Gap · ☆88 · Updated 10 months ago
- ☆43 · Updated 10 months ago
- An introduction to LLM Sampling · ☆79 · Updated 8 months ago
- ☆134 · Updated last year
- ☆61 · Updated last year
- ☆52 · Updated last year