Guangxuan-Xiao / GSM8K-eval
☆32 · Updated last year
Alternatives and similar repositories for GSM8K-eval:
Users interested in GSM8K-eval are comparing it to the libraries listed below.
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications · ☆71 · Updated last week
- Official repository for the paper "Safety Alignment Should Be Made More Than Just a Few Tokens Deep" · ☆76 · Updated 8 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ · ☆184 · Updated 10 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) · ☆106 · Updated 11 months ago
- Reference implementation for Token-level Direct Preference Optimization (TDPO) · ☆128 · Updated 3 weeks ago
- [NeurIPS 2024] How do Large Language Models Handle Multilingualism? · ☆27 · Updated 4 months ago
- A Survey on the Honesty of Large Language Models · ☆54 · Updated 3 months ago
- Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning · ☆161 · Updated last year
- Paper list on decoding methods for LLMs and LVLMs · ☆31 · Updated 2 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… · ☆116 · Updated 7 months ago
- [ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! · ☆34 · Updated 7 months ago
- LLM Unlearning · ☆142 · Updated last year
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization · ☆70 · Updated 6 months ago
- ☆50 · Updated last year
- A survey on harmful fine-tuning attacks for large language models · ☆145 · Updated this week
- ☆13 · Updated 10 months ago
- FeatureAlignment = Alignment + Mechanistic Interpretability · ☆28 · Updated last month
- Code and data repo for the [ICLR 2025] paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" · ☆26 · Updated 2 months ago
- Language Imbalance Driven Rewarding for Multilingual Self-improving · ☆15 · Updated 4 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" · ☆105 · Updated 5 months ago
- This is my attempt to create a Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g… · ☆29 · Updated 2 months ago
- ☆30 · Updated 5 months ago
- ☆47 · Updated 7 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* · ☆95 · Updated 2 months ago
- A curated list of LLM interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc. · ☆205 · Updated 4 months ago