Deepmark AI provides a testing environment for assessing large language models (LLMs) on task-specific metrics and on your own data, so that your GenAI-powered solution performs predictably and reliably.
☆104 · Nov 24, 2023 · Updated 2 years ago
Alternatives and similar repositories for deepmark
Users who are interested in deepmark are comparing it to the libraries listed below.
- Analyze your image in seconds with AI ☆63 · May 28, 2024 · Updated last year
- Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale. ☆867 · Jan 15, 2024 · Updated 2 years ago
- LLM evaluation. ☆16 · Nov 7, 2023 · Updated 2 years ago
- Automate UI testing + functionality testing with GPT-4 Vision ☆45 · Dec 17, 2023 · Updated 2 years ago
- Awesome material (papers, tools, etc.) about testing machine learning systems, including deep learning systems. ☆47 · Oct 12, 2021 · Updated 4 years ago
- A Kurtosis package for Python data engineers, deploying a Jupyter notebook along with a configurable set of databases, and a visualizatio… ☆109 · Dec 4, 2023 · Updated 2 years ago
- Explore GitHub repositories with natural language questions ☆98 · Dec 22, 2024 · Updated last year
- AI Research Agent - Search, Scrape, Summarize, Analysis ☆15 · Nov 11, 2023 · Updated 2 years ago
- An end-to-end benchmark suite of multi-modal DNN applications for system-architecture co-design ☆22 · Dec 13, 2024 · Updated last year
- ⚡ GUI for editing LLM vector embeddings. No more blind chunking. Upload content in any file extension, join and split chunks, edit metada… ☆229 · Nov 21, 2023 · Updated 2 years ago
- This project is designed to make it more efficient to collate data from a YouTube channel to create custom GPTs, train models or for use … ☆14 · Dec 2, 2023 · Updated 2 years ago
- AI Developer is an AI agent powered by GPT-4-Turbo that's using custom E2B Sandbox ☆54 · Feb 11, 2025 · Updated last year
- Documentation for the Krixik Python client. ☆38 · Nov 8, 2024 · Updated last year
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆63 · Mar 26, 2024 · Updated last year
- ☆11 · Aug 28, 2023 · Updated 2 years ago
- A complete guide to evaluating LLMs and RAGs. Both theory and code-based approaches covered. ☆29 · Nov 16, 2023 · Updated 2 years ago
- Praetor is a lightweight finetuning data and prompt management tool ☆67 · Nov 16, 2024 · Updated last year
- A fast and minimal framework for building agentic systems ☆473 · Feb 20, 2026 · Updated last month
- Simplistic and minimalist storage. ☆24 · May 11, 2025 · Updated 10 months ago
- Project dedicated to exploring the capabilities of LLMRouterChain ☆22 · Jul 30, 2023 · Updated 2 years ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆43 · Feb 15, 2024 · Updated 2 years ago
- HeroML is an AI Prompt Chain/Workflow interpreter for Apps built on https://hero.page ☆55 · Aug 10, 2023 · Updated 2 years ago
- IPython notebook copy of Andrej Karpathy's llama2.c ☆23 · Sep 5, 2023 · Updated 2 years ago
- Proteus is an experimental platform that combines the power of Large Language Models with the Genesis physics engine ☆26 · Dec 20, 2024 · Updated last year
- Large language model evaluation and workflow framework from Phase AI. ☆460 · Jan 21, 2025 · Updated last year
- A WIP architecture designed to allow transformers to think in a manner without tokens ☆20 · Apr 12, 2024 · Updated last year
- POC of a phone used as SMS gateway to serve queries to ChatGPT over GSM network using the regular Android message app. ☆19 · Jul 18, 2023 · Updated 2 years ago
- Clint LLM GitHub Pages ☆83 · Sep 12, 2024 · Updated last year
- Python SDK for running evaluations on LLM generated responses ☆298 · Jun 6, 2025 · Updated 9 months ago
- With just Llama-2, generate full React codebases from a single prompt ☆103 · Aug 25, 2023 · Updated 2 years ago
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat… ☆448 · Nov 24, 2025 · Updated 3 months ago
- Assign color hues to a collection of text fragments based on embeddings ☆20 · Jun 15, 2024 · Updated last year
- Flowchart-like UI to interconnect LLMs and Hugging Face models, and deploy them as a REST API with little to no code. ☆72 · Mar 21, 2025 · Updated 11 months ago
- A simple "Be My Eyes" web app with a llama.cpp/llava backend ☆493 · Nov 28, 2023 · Updated 2 years ago
- Just a bunch of benchmark logs for different LLMs ☆119 · Jul 28, 2024 · Updated last year
- Wraps openai.ChatCompletion to produce pydantic model output via schema prompt and error feedback. ☆54 · Jun 4, 2023 · Updated 2 years ago
- Evaluating LLMs with CommonGen-Lite ☆95 · Mar 21, 2024 · Updated last year
- Build, Improve Performance, and Productionize your LLM Application with an Integrated Framework ☆342 · Nov 26, 2024 · Updated last year
- Create a QnA bot on a PDF ☆16 · May 27, 2023 · Updated 2 years ago