Luckfort / CD
"Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?"
☆55 Updated this week
Related projects:
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆89 Updated 4 months ago
- Code and example data for the paper: Rule Based Rewards for Language Model Safety ☆131 Updated 2 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆55 Updated 3 months ago
- GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations ☆43 Updated 2 weeks ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆70 Updated 5 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆107 Updated last month
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ☆59 Updated 3 months ago
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning ☆64 Updated 9 months ago
- Parsimonious Concept Engineering (PaCE) uses sparse coding on a large-scale concept dictionary to effectively improve the trustworthiness… ☆25 Updated 3 months ago
- Official Code for paper "Towards Efficient and Effective Unlearning of Large Language Models for Recommendation" (Frontiers of Computer S… ☆34 Updated 2 months ago
- A task generation and model evaluation system. ☆51 Updated 2 weeks ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆82 Updated 2 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆84 Updated 5 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆46 Updated 5 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆44 Updated 8 months ago
- 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆52 Updated 3 weeks ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆79 Updated 3 months ago
- Weak-to-Strong Jailbreaking on Large Language Models ☆62 Updated 6 months ago
- AI Logging for Interpretability and Explainability🔬 ☆74 Updated 3 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆39 Updated 7 months ago
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap… ☆81 Updated last month
- Official implementation of Goldfish Loss: Mitigating Memorization in Generative LLMs ☆68 Updated 2 months ago
- PASTA: Post-hoc Attention Steering for LLMs ☆96 Updated last week
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI ☆55 Updated last week
- Official repository for paper "GTA: A Benchmark for General Tool Agents" ☆28 Updated 2 months ago