hud-evals / hud-sdk
HUD SDK
☆21 Updated this week
Alternatives and similar repositories for hud-sdk
Users who are interested in hud-sdk are comparing it to the libraries listed below.
- Verdict is a library for scaling judge-time compute. ☆211 Updated 2 weeks ago
- ☆125 Updated last month
- ☆49 Updated this week
- Draw more samples ☆189 Updated 10 months ago
- Open source interpretability artefacts for R1. ☆109 Updated 3 weeks ago
- ☆97 Updated 7 months ago
- ☆74 Updated 3 weeks ago
- ☆22 Updated 6 months ago
- A reading list of relevant papers and projects on foundation model annotation ☆27 Updated 2 months ago
- Sphynx Hallucination Induction ☆54 Updated 3 months ago
- PyTorch library for Active Fine-Tuning ☆72 Updated 3 months ago
- Code for Columbia University COMS 3997 – LLM Ethics and Foundations ☆14 Updated 4 months ago
- ☆111 Updated 4 months ago
- Red-Teaming Language Models with DSPy ☆192 Updated 3 months ago
- WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks? ☆184 Updated last week
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆17 Updated 11 months ago
- ☆114 Updated 2 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 Updated 4 months ago
- ☆93 Updated 7 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆173 Updated 2 months ago
- Tzafon-WayPoint is a robust, scalable solution for managing large fleets of browser instances. WayPoint stands out with unmatched cold‑st… ☆55 Updated 3 weeks ago
- ☆10 Updated 10 months ago
- ☆129 Updated last month
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆156 Updated this week
- The history files when recording human interaction while solving ARC tasks ☆109 Updated this week
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆235 Updated 3 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆91 Updated last month
- Use the OpenAI Batch tool to make async batch requests to the OpenAI API. ☆98 Updated last year
- ☆78 Updated this week
- ☆48 Updated last year