technion-cs-nlp/LLMsKnow


technion-cs-nlp / LLMsKnow

☆95

Alternatives and similar repositories for LLMsKnow

Users who are interested in LLMsKnow are comparing it to the repositories listed below.


harish-kamath / rqae
Residual Quantization Autoencoder, used for interpreting LLMs
☆14 · Jan 1, 2025 · Updated last year
LLLeoLi / LARF
[EMNLP 2025] Layer-Aware Representation Filtering: Purifying Finetuning Data to Preserve LLM Safety Alignment
☆15 · Jul 22, 2025 · Updated last year
cisnlp / GlotWeb
[WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages
☆17 · Apr 14, 2026 · Updated 3 months ago
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆83 · Jun 20, 2026 · Updated last month
AngelaZZZ-611 / reasoning_models_probing
☆21 · May 14, 2026 · Updated 2 months ago
Jayfeather1024 / Backdoor-Enhanced-Alignment
☆24 · Dec 8, 2024 · Updated last year
shacharKZ / VISIT-Visualizing-Transformers
☆25 · Apr 3, 2024 · Updated 2 years ago
Awenbocc / LLM-OOD
☆14 · Jul 24, 2024 · Updated 2 years ago
CLAIRE-Labo / quantile-reward-policy-optimization
Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok …
☆30 · Dec 8, 2025 · Updated 7 months ago
aladinD / SafeMERGE
Code for SafeMERGE (ICLR 2025).
☆15 · Apr 1, 2025 · Updated last year
amazon-science / factual-confidence-of-llms
Code for paper "Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators"
☆17 · Dec 4, 2024 · Updated last year
sciai-lab / Truth_is_Universal
☆34 · Nov 7, 2024 · Updated last year
nju-websoft / MAGIC
Multi-Aspect Controllable Text Generation with Disentangled Counterfactual Augmentation, ACL 2024 (main)
☆14 · Sep 23, 2024 · Updated last year
avalonstrel / Mitigating-the-Alignment-Tax-of-RLHF
☆16 · Feb 8, 2024 · Updated 2 years ago
carlini / llm-biographer
Have an LLM write your biography, probably incorrectly
☆15 · Dec 26, 2024 · Updated last year
saprmarks / geometry-of-truth
☆114 · Aug 8, 2024 · Updated last year
chanind / linear-relational
Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch
☆11 · Aug 7, 2024 · Updated last year
douglascrockford / MEC
Modular Matrix Exponentiation Cryptography
☆10 · Nov 27, 2023 · Updated 2 years ago
mahtabbigverdi / Aurora
☆12 · Dec 4, 2024 · Updated last year
Qwen-Applications / MARCH
☆28 · Jun 9, 2026 · Updated last month
ezhangle / caskbench
A Cairo/Skia Benchmark
☆11 · Oct 14, 2014 · Updated 11 years ago
AmourWaltz / UAlign
Project of ACL 2025 "UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models"
☆15 · Mar 25, 2025 · Updated last year
Trustworthy-Information-Access / LLM-Knowledge-Boundary-Perception-via-Internal-States
Official code for the paper Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception. The code is based on t…
☆22 · Aug 5, 2025 · Updated 11 months ago
Arthurizijar / Text_aligns_tokens
The official code implementation of the ACL2025 paper “A Text is Worth Several Tokens: Text Embedding from LLMs Secretly Aligns Well with…
☆18 · Jul 12, 2025 · Updated last year
HITsz-TMG / ICL-State-Vector
☆12 · Jul 4, 2024 · Updated 2 years ago
THU-KEG / VerIF
[EMNLP 2025] Verification Engineering for RL in Instruction Following
☆57 · Mar 30, 2026 · Updated 3 months ago
PKU-YuanGroup / AsFT
Code for the paper "AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin".
☆37 · Jul 10, 2025 · Updated last year
JacksonUptain / mariomak2jones
Please star this and feel free to look up on mario maker
☆11 · Jan 24, 2023 · Updated 3 years ago
Lingkai-Kong / RE-Control
Code for paper: Aligning Large Language Models with Representation Editing: A Control Perspective
☆35 · Jan 31, 2025 · Updated last year
dunzeng / MORE
Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment
☆16 · Aug 6, 2024 · Updated last year
Alibaba-AAIG / Oyster
The Oyster series is a set of safety models developed in-house by Alibaba-AAIG, devoted to building a responsible AI ecosystem. | Oyster …
☆62 · Apr 29, 2026 · Updated 2 months ago
Trustworthy-ML-Lab / Linear-Explanations
[ICML 24] A novel automated neuron explanation framework that can accurately describe poly-semantic concepts in deep neural networks
☆14 · May 2, 2025 · Updated last year
shachardon / naturally_occurring_feedback
☆14 · Dec 1, 2025 · Updated 7 months ago
ApolloResearch / apd
Attribution-based Parameter Decomposition
☆35 · Jun 11, 2025 · Updated last year
huanranchen / LLMLandscape
The loss landscape of Large Language Models resemble basin!
☆41 · Jul 8, 2025 · Updated last year
reds-lab / Meta-Sift
The official implementation of USENIX Security'23 paper "Meta-Sift" -- Ten minutes or less to find a 1000-size or larger clean subset on …
☆20 · Apr 27, 2023 · Updated 3 years ago
javiferran / sae_entities
☆78 · Mar 6, 2025 · Updated last year
EleutherAI / deep-ignorance
☆20 · Jan 7, 2026 · Updated 6 months ago
zijian678 / TDD
☆14 · Apr 22, 2024 · Updated 2 years ago