stan-anony / Zero-shot-EoT-Prompting
Zero-Shot Chain-of-Thought Reasoning Guided by Evolutionary Algorithms in Large Language Models
☆10 Updated 6 months ago
Related projects:
- ☆10 Updated 8 months ago
- ☆16 Updated 3 months ago
- This repository contains the code and data for the paper "SelfIE: Self-Interpretation of Large Language Model Embeddings" by Haozhe Chen,… ☆30 Updated 5 months ago
- This is the official project for our paper: Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers ☆20 Updated 8 months ago
- This repo contains some extensions of deepspeed-chat for fine-tuning LLMs (SFT+RLHF). ☆15 Updated 2 months ago
- ☆13 Updated 2 months ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆57 Updated 7 months ago
- Framework for Cost-Effective Language Model Choice ☆13 Updated 9 months ago
- Codebase for Inference-Time Policy Adapters ☆19 Updated 10 months ago
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv 2401.01335) ☆25 Updated 6 months ago
- Survival of the Most Influential Prompts: Efficient Black-Box Prompt Search via Clustering and Pruning (Zhou et al.; EMNLP 2023 Findings) ☆16 Updated 7 months ago
- [ACL'24] Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correla… ☆28 Updated last month
- [ICML 2024 Oral] A framework for society simulation that supports complex simulation, for example: multi-scene. ☆39 Updated last month
- Token-level adaptation of LoRA matrices for downstream task generalization. ☆14 Updated 5 months ago
- [ACL'24] Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements ☆11 Updated last week
- ☆10 Updated 3 weeks ago
- This is the official project of the paper: Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conver… ☆14 Updated 2 months ago
- ☆8 Updated 4 months ago
- RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning ☆12 Updated 5 months ago
- Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A… ☆39 Updated 7 months ago
- Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs ☆11 Updated 7 months ago
- Flow of Reasoning: Efficient Training of LLM Policy with Diverse Thinking ☆25 Updated this week
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆73 Updated 2 months ago
- A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use ☆106 Updated 5 months ago
- Parsimonious Concept Engineering (PaCE) uses sparse coding on a large-scale concept dictionary to effectively improve the trustworthiness… ☆25 Updated 3 months ago
- Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models ☆11 Updated 10 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆55 Updated 3 months ago
- Official repository for paper "GTA: A Benchmark for General Tool Agents" ☆28 Updated 2 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆70 Updated 5 months ago
- Benchmarking LLMs' Gaming Ability in Multi-Agent Environments ☆33 Updated this week