ChnQ / TracingLLM
☆25Updated 11 months ago
Alternatives and similar repositories for TracingLLM
Users that are interested in TracingLLM are comparing it to the libraries listed below
Sorting:
- ☆36Updated 7 months ago
- SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities☆13Updated last month
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"☆13Updated 10 months ago
- Code for "A Sober Look at Progress in Language Model Reasoning" paper☆45Updated last week
- The repository of the paper "REEF: Representation Encoding Fingerprints for Large Language Models," aims to protect the IP of open-source…☆43Updated 3 months ago
- Lightweight Adapting for Black-Box Large Language Models☆22Updated last year
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models"☆58Updated 7 months ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning☆92Updated 11 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)☆57Updated last year
- Pytorch implementation of Tree Preference Optimization (TPO) (Accepyed by ICLR'25)☆17Updated 2 weeks ago
- Implementation of the MATRIX framework (ICML 2024)☆51Updated last year
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives".☆22Updated 6 months ago
- Official repo for EMNLP'24 paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning"☆25Updated 7 months ago
- ☆23Updated last month
- RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024☆73Updated 7 months ago
- Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free☆20Updated last month
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications☆76Updated last month
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.☆21Updated 2 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆38Updated last year
- ☆22Updated 2 months ago
- [NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"☆35Updated 3 months ago
- [AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA☆25Updated last year
- The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning”☆16Updated last year
- [NeurIPS 2024 D&B] Evaluating Copyright Takedown Methods for Language Models☆17Updated 9 months ago
- ☆28Updated 10 months ago
- A Survey on the Honesty of Large Language Models☆57Updated 5 months ago
- ☆42Updated 3 months ago
- ☆9Updated 6 months ago
- This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting☆18Updated 9 months ago
- code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models☆32Updated 5 months ago