EntropyLabsAI / sentinel
A control plane to oversee agents operating in the wild
☆22 · Updated this week
Related projects
Alternatives and complementary repositories for sentinel
- Repository for ACM India Summer School on Generative AI for Text ☆11 · Updated 4 months ago
- Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL. ☆200 · Updated 9 months ago
- Machine Learning for Alignment Bootcamp (MLAB). ☆22 · Updated 2 years ago
- ☆47 · Updated 5 months ago
- ☆67 · Updated 2 weeks ago
- The nnsight package enables interpreting and manipulating the internals of deep learned models. ☆406 · Updated this week
- Sparse autoencoders ☆344 · Updated last week
- large population models ☆214 · Updated 3 weeks ago
- Website for hosting the Open Foundation Models Cheat Sheet. ☆257 · Updated 4 months ago
- ☆351 · Updated this week
- Uncertainty quantification with PyTorch ☆329 · Updated 2 weeks ago
- Serbian LLM Eval. ☆88 · Updated 8 months ago
- Deep Learning Fundamentals -- Code material and exercises ☆349 · Updated 8 months ago
- Puzzles for exploring transformers ☆325 · Updated last year
- Simple Transformer in Jax ☆119 · Updated 5 months ago
- Mechanistic Interpretability Visualizations using React ☆200 · Updated 4 months ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆717 · Updated last month
- A Jax-based library for designing and training transformer models from scratch. ☆276 · Updated 2 months ago
- ☆391 · Updated last month
- METR Task Standard ☆124 · Updated 3 weeks ago
- An introduction to LLM Sampling ☆64 · Updated 2 weeks ago
- Draw more samples ☆179 · Updated 4 months ago
- Keeping language models honest by directly eliciting knowledge encoded in their activations. ☆186 · Updated this week
- This repository's goal is to precompile all past presentations of the Huggingface reading group ☆46 · Updated 2 months ago
- Highly commented implementations of Transformers in PyTorch ☆128 · Updated last year
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆195 · Updated 6 months ago
- 🧠 Starter templates for doing interpretability research ☆63 · Updated last year
- This repository collects all relevant resources about interpretability in LLMs ☆289 · Updated 3 weeks ago
- ☆197 · Updated 4 months ago
- ☆21 · Updated last month