emooreatx / EthicsEngine
☆14Updated 4 months ago
Alternatives and similar repositories for EthicsEngine
Users who are interested in EthicsEngine are comparing it to the libraries listed below
- ☆14Updated 2 years ago
- You should use PySR to find scaling laws. Here's an example.☆33Updated last year
- An extendible framework for executing benchmarks and computational experiments at scale☆23Updated this week
- Source code for Activated LoRA☆21Updated this week
- Public repository containing METR's DVC pipeline for eval data analysis☆108Updated 5 months ago
- MLX implementation of GCN, with benchmark on MPS, CUDA and CPU (M1 Pro, M2 Ultra, M3 Max).☆24Updated last year
- The application is an end-user training and evaluation system for standard knowledge graph embedding models. It was developed to optimise …☆18Updated 3 months ago
- ☆19Updated last month
- FMS Model Optimizer is a framework for developing reduced precision neural network models.☆20Updated 2 weeks ago
- Docker image NVIDIA GH200 machines - optimized for vllm serving and hf trainer finetuning☆48Updated 7 months ago
- train with kittens!☆62Updated 10 months ago
- Clean RL implementation using MLX☆32Updated last year
- A platform for Interactive AI-assisted Hypothesis Generation [ACL 2025]☆21Updated last month
- A framework for few-shot evaluation of autoregressive language models.☆12Updated 2 months ago
- This is the repository holding code and data for "FrugalML: How to Use ML Prediction APIs More Accurately and Cheaply".☆35Updated 4 years ago
- Automation Framework using LLM-as-a-judge to Scale Eval of Gen AI solutions (RAG, Multi-turn, Query Rewrite, Text2SQL etc.); that is a go…☆32Updated 8 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆96Updated last month
- Jax like function transformation engine but micro, microjax☆33Updated 10 months ago
- LeanAgent is a novel lifelong learning framework for formal theorem proving that continuously generalizes to and improves on ever-expandi…☆36Updated 3 months ago
- A tool for benchmarking LLMs on Modal☆43Updated 3 weeks ago
- Experiments to assess SPADE on different LLM pipelines.☆17Updated last year
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆95Updated this week
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖☆75Updated 9 months ago
- ☆45Updated last year
- Analysis code for paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks"☆52Updated last month
- Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.☆15Updated last year
- ACPBench: Reasoning about Action, Change, and Planning☆26Updated last month
- ☆25Updated 3 months ago
- IBM development fork of https://github.com/huggingface/text-generation-inference☆61Updated this week
- Repository of machine learning benchmarks☆42Updated this week