lxx0628 / Prompting-Framework-Survey
A curated list of awesome publications and researchers on prompting frameworks, updated and maintained by The Intelligent System Security (IS2).
☆84 · Updated 7 months ago
Alternatives and similar repositories for Prompting-Framework-Survey
Users interested in Prompting-Framework-Survey are comparing it to the repositories listed below
- Open Implementations of LLM Analyses ☆106 · Updated 10 months ago
- ☆98 · Updated 11 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆79 · Updated 11 months ago
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀 ☆100 · Updated last month
- Mixing Language Models with Self-Verification and Meta-Verification ☆107 · Updated 8 months ago
- CodeSage: Code Representation Learning At Scale (ICLR 2024) ☆112 · Updated 10 months ago
- Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators" ☆135 · Updated last year
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆97 · Updated last year
- We present the first systematic study on the scaling property of raw agents instantiated by LLMs. We find that performance scales with th… ☆129 · Updated 10 months ago
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test generation ☆54 · Updated last week
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆120 · Updated 6 months ago
- Official repo of Rephrase-and-Respond: data, code, and evaluation ☆103 · Updated last year
- ☆43 · Updated last year
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models ☆62 · Updated last year
- An LLM reads a paper and produces a working prototype ☆57 · Updated 4 months ago
- A list of LLM benchmark frameworks. ☆70 · Updated last year
- Data and evaluation scripts for "CodePlan: Repository-level Coding using LLMs and Planning", FSE 2024 ☆74 · Updated last year
- ☆78 · Updated 11 months ago
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents ☆127 · Updated last year
- ☆78 · Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 · Updated last year
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆92 · Updated 3 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆115 · Updated 11 months ago
- ☆113 · Updated 3 months ago
- ☆59 · Updated 8 months ago
- Evaluating LLMs with fewer examples ☆160 · Updated last year
- ☆109 · Updated 2 months ago
- Self-Reflection in LLM Agents: Effects on Problem-Solving Performance ☆83 · Updated 9 months ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆54 · Updated 5 months ago
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ☆114 · Updated last month