haizelabs / sphynx
Sphynx Hallucination Induction
☆52 · Updated last month
Alternatives and similar repositories for sphynx:
Users who are interested in sphynx are comparing it to the libraries listed below:
- Red-Teaming Language Models with DSPy ☆171 · Updated 3 weeks ago
- ☆20 · Updated 4 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 · Updated 7 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆89 · Updated 9 months ago
- ☆80 · Updated 2 months ago
- An attribution library for LLMs ☆37 · Updated 5 months ago
- ☆48 · Updated last year
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆80 · Updated this week
- Functional Benchmarks and the Reasoning Gap ☆84 · Updated 5 months ago
- Using various instructor clients to evaluate the quality and capabilities of extractions and reasoning. ☆49 · Updated 5 months ago
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆17 · Updated 9 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆22 · Updated 2 months ago
- Small, simple agent task environments for training and evaluation ☆18 · Updated 4 months ago
- Track the progress of LLM context utilisation ☆53 · Updated 7 months ago
- A framework-less approach to robust agent development. ☆156 · Updated this week
- ☆123 · Updated last month
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆99 · Updated 11 months ago
- Use the OpenAI Batch tool to make async batch requests to the OpenAI API. ☆95 · Updated last year
- Writing Blog Posts with Generative Feedback Loops! ☆47 · Updated 11 months ago
- ☆110 · Updated 2 weeks ago
- Synthetic Data for LLM Fine-Tuning ☆111 · Updated last year
- A new benchmark for measuring LLMs' capability to detect bugs in large codebases. ☆29 · Updated 9 months ago
- ☆31 · Updated last week